I'm just trying to get the Python output into a variable right away instead of fiddling around with a temporary file as in the second example. Can it be done?
SETLOCAL EnableDelayedExpansion
FOR /F "tokens=1,2 delims= " %%a IN (file.txt) DO (
    FOR /F "tokens=* USEBACKQ" %%g IN (`echo %%a | python -m base64 -d`) do (SET "decode1=%%g")
    <nul set /p =!decode1! >> new2.txt
    echo. >> new2.txt
)
Here is the structure of my file.txt. It is simple. It has multiple lines (a few hundred) with 2 columns of base64 that need decoding into a new.txt file with a space between each column just like in file.txt.
file.txt
Q2FTZlpsZXhlR2dWNWV2Mkxsd0JQdw== Ylk5VEh3M1dRSFhqUHV2WjlMcURyZw==
ZDRhMHhHbndLTDBKa3A4S0piSlB2dw== NWxVODVSNEJUUVZpMGx0UHNERVJvQQ==
RHBHTEFfS0poWWlXbnI3c3NFUzlHQQ== RUcwb3dUWmNYM2UtMzBKVFhOWk1uQQ==
Z3RvanZRMUloMzhsYjkyVXNpNTZ1Zw== LW5rLXdndmYyYzRDLW9oNWg1Nk1Udw==
The second example, which runs the base64 decode in a FOR loop through a temporary file:
FOR /F "tokens=1,2 delims= " %%a IN (file.txt) DO (
    echo %%a | python -m base64 -d > temp.txt
    set /p var1=<temp.txt
    <nul set /p =!var1! >> new.txt
    echo. >> new.txt
)
python -c "import base64, sys; f=open('file.txt', 'rb'); [print(base64.b64decode(lines.split()[0]).decode(), base64.b64decode(lines.split()[1]).decode()) for lines in f]; f.close()" > new2.txt
This avoids fiddling around with a for loop, set /p, etc.: Python reads file.txt itself, and its output is redirected to new2.txt. If the encoding of the output file is an issue, then perhaps Python could do the write as well.
Tested with Python 3.8 if that matters.
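If the output encoding does turn out to matter, one option is to let Python write the file itself. The following is only a minimal sketch, assuming every column decodes to UTF-8 text; the names decode_line and decode_file and the default encoding are my own choices for illustration, not part of the one-liner above:

```python
import base64


def decode_line(line):
    """Decode every whitespace-separated Base64 item on one line.

    Assumption: each decoded item is valid UTF-8 text.
    """
    return " ".join(base64.b64decode(item).decode("utf-8") for item in line.split())


def decode_file(src="file.txt", dst="new2.txt", encoding="utf-8"):
    """Read src line by line and write the decoded columns to dst.

    The output encoding is chosen explicitly here instead of being left
    to the console redirection.
    """
    with open(src, "r", encoding="ascii") as fin, \
         open(dst, "w", encoding=encoding) as fout:
        for line in fin:
            if line.strip():  # ignore blank lines
                fout.write(decode_line(line) + "\n")
```

Saved as a script, calling decode_file() from the folder that contains file.txt would produce new2.txt without any redirection on the command line.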
The main issue in your code is the lack of escaping as demonstrated in this related answer and also this related comment:
FOR /F "tokens=* USEBACKQ" %%g IN (`echo %%a | python -m base64 -d`) do (SET "decode1=%%g")
should read:
FOR /F "tokens=* USEBACKQ" %%g IN (`echo %%a ^| python -m base64 -d`) do (SET "decode1=%%g")
in order to hide the pipe | from the first parsing phase that processes the whole for /F loop definition.
In addition, I recommend using set /p ="!decode1!" rather than set /p =!decode1! in order to preserve surrounding quotation marks that could be present in the value of variable decode1.
Anyway, I want to offer an alternative to your conversion approach, based on the tool certutil, which I think has been available since Windows XP:
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem // Define constants here:
set "_SEP= " & rem // (separator character or string for the output)
set "_TMP=%TEMP%\%~n0_%RANDOM%.tmp" & rem // (temporary file for conversion of single items)
set "_CHR=%TEMP%\%~n0_%RANDOM%.chr" & rem // (temporary file to hold just a given separator)
rem // Create a temporary file that only contains a single separator as specified:
> nul forfiles /P "%~dp0." /M "%~nx0" /C "cmd /V /C ^> 0x22!_CHR!0x22 echo(!_SEP!0x1A"
> nul copy "%_CHR%" /A + nul "%_CHR%" /B
rem // Loop through all non-empty lines read from the console input:
for /F "tokens=* eol= " %%L in ('more') do (
    rem // Explicitly deny wildcard characters that would cause problems with a `for` loop later:
    for /F "delims=*?<> eol=*" %%C in ("%%L") do if not "%%C"=="%%L" (
        >&2 echo "%%L" contains forbidden characters!
    ) else (
        rem // Loop through whitespace-separated items in the current line:
        set "FIRST=#"
        for %%I in (%%L) do (
            rem // Check current items against list of all valid characters for Base64 encoding:
            (
                for /F "delims=ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/= eol==" %%D in ("%%I") do rem/
            ) && (
                >&2 echo "%%I" is not valid Base64 encoding!
            ) || (
                rem // Actually perform the Base64 decoding of the current item:
                > "%_TMP%" echo(%%I
                > nul certutil -f -v -decode "%_TMP%" "%_TMP%"
                rem // Write the resulting data to the console output:
                if not defined FIRST type "%_CHR%"
                type "%_TMP%"
                set "FIRST="
            )
        )
        if not defined FIRST echo/
    )
)
rem // Clean up temporary files:
del "%_TMP%" "%_CHR%"
endlocal
exit /B
This script, let us call it decode-b64-chunks.bat, reads Base64-encoded data from the console and writes the decoded data to the console. To convert the sample input file file.txt from your question to a new file in the current working directory, use the following command line:
decode-b64-chunks.bat < "file.txt" > "new.txt"
The resulting file named new.txt will then contain the following text:
CaSfZlexeGgV5ev2LlwBPw bY9THw3WQHXjPuvZ9LqDrg
d4a0xGnwKL0Jkp8KJbJPvw 5lU85R4BTQVi0ltPsDERoA
DpGLA_KJhYiWnr7ssES9GA EG0owTZcX3e-30JTXNZMnA
gtojvQ1Ih38lb92Usi56ug -nk-wgvf2c4C-oh5h56MTw
The above script contains two particular portions of code I want to go into:
Explicitly deny wildcard characters that would cause problems with a for loop later:
for /F "delims=*?<> eol=*" %%C in ("%%L") do if not "%%C"=="%%L" (
    >&2 echo "%%L" contains forbidden characters!
) else (
    …
)
%%L contains the current line string. The delims option of the for /F loop lists all wildcard characters (* and ? are well known, but < and > are undocumented ones). If the current line contains such a character, for /F tokenises it, so its meta-variable %%C no longer equals the original line string in %%L; in that case the whole line is skipped and an error message is returned.
The reason for excluding such lines is the loop for %%I in (%%L) do, which is used to walk through the whitespace-separated items in a line. A standard for loop (no /F switch) is intended to loop through files, but it actually only accesses the file system when at least one wildcard character is encountered, which is what I want to prevent.
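For readers more at home in Python, the guard described above boils down to "reject the line if it contains any of * ? < >". A rough analogue follows; it is not the batch script's own code, and the names WILDCARDS, has_wildcards and split_items are hypothetical:

```python
# Characters that a plain batch `for` loop treats as wildcards
# (copied from the delims option of the check above).
WILDCARDS = frozenset("*?<>")


def has_wildcards(line):
    """Return True if the line contains at least one wildcard character."""
    return any(ch in WILDCARDS for ch in line)


def split_items(line):
    """Split a line into whitespace-separated items, refusing wildcard lines."""
    if has_wildcards(line):
        raise ValueError(f'"{line}" contains forbidden characters!')
    return line.split()
```

In Python the check is not needed for safety, since str.split never touches the file system; the sketch only shows the decision the batch code makes before it hands the line to the plain for loop.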
Check current items against list of all valid characters for Base64 encoding:
(
    for /F "delims=ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/= eol==" %%D in ("%%I") do rem/
) && (
    >&2 echo "%%I" is not valid Base64 encoding!
) || (
    …
)
The delims option lists all characters that are permitted in a Base64-encoded string. If an item %%I contains only such characters, it is a delimiter-only line, which for /F skips; if any other character occurs, for /F processes the text, although nothing happens as there is only rem/ in the loop body. All I want to know is whether for /F iterates or not, because that tells me whether the processed item is valid Base64-encoded data.
The trick here is that for /F (as opposed to the other kinds of for loops) resets the exit code to zero when it iterates at least once and sets it to a non-zero value when it does not. The conditional operator && executes the following block only when the exit code is zero, that is, when the for /F loop iterated, which means a forbidden character was encountered; in that case the current item is skipped and an error message is returned. The conditional operator || executes the following block when the exit code is not zero, that is, when the item contains only valid Base64 characters.
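The same test, "is anything left once all permitted characters are taken away?", can be sketched in Python for comparison. This only illustrates the logic and is not a replacement for the batch trick; the names BASE64_CHARS, invalid_chars and is_base64_chars are hypothetical, and the character set is copied from the delims option above:

```python
import string

# The characters permitted in a (standard) Base64-encoded string.
BASE64_CHARS = frozenset(
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/="
)


def invalid_chars(item):
    """Return the sorted characters of item that are not permitted in Base64."""
    return sorted(set(item) - BASE64_CHARS)


def is_base64_chars(item):
    """True when nothing is left over, i.e. the case where for /F does not iterate."""
    return not invalid_chars(item)
```

Like the batch version, this only checks the character set, not the padding or the length. If the actual decoding is done in Python anyway, base64.b64decode(item, validate=True) raises binascii.Error for characters outside the alphabet, which makes a separate check unnecessary.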