[html tags]VPID:[html tags]123h[html tags]
It might help if you posted some actual lines, plus what you want for the output.
You can do away with the grep part and let sed do the selecting:
sed -n '/VPID/s/.*VPID:<html_tag>\(...h\).*/VPID: \1/p' yourfile >input_file_for_expect
The \( \) pair saves what is in between, and you can reuse it in the replacement part as \1. The -n option together with the p flag makes sed print only the lines where the substitution succeeded, which is what replaces the grep. Here I assumed that there is only one VPID number per line. Generally, what you do is identify a string or regular expression that acts as an anchor so that you save the part you want. The parts of the regex outside the \( ... \) are the anchors.
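To see the capture group on its own, here is a small sketch you can paste into a shell. The sample line and the <b> tag are made up for illustration, since I don't know what your real tags look like; substitute whatever actually precedes the number.

```shell
# Hypothetical sample line; <td> and <b> stand in for the real html tags.
sample='<td>VPID:<b>123h</b></td>'

# -n suppresses sed's default output; the p flag then prints only the
# lines where the substitution actually matched.
result=$(printf '%s\n' "$sample" | sed -n 's/.*VPID:<b>\(...h\).*/VPID: \1/p')

printf '%s\n' "$result"
```

Everything outside \( \) is matched and thrown away, so only the three characters plus the trailing "h" survive into the output.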
This looks a bit similar to how I use the saved .K3B file: I filter out the names of the files that were backed up and pipe them to xargs to remove them.
sed -e '/^<url>/!d' -e 's/<url>\(.*\)<\/url>/\1/' maindata.xml | tr '\n' '\000' | xargs -0 rm
Remember to include a backslash before a forward slash if you match '/' in the pattern, as in a closing tag. I left out the sed commands which handle '&amp;' -> '&', '&gt;' -> '>' and '&lt;' -> '<' so the line I posted wouldn't get too long. The '-e' option precedes each sed command, allowing you to apply more than one command to each line in a single sed invocation. The first sed command, "-e '/^<url>/!d'", deletes lines that don't start with "<url>", which are the lines that don't contain filenames. The "<url>" here is literally what is in the xml file. The "<url>" and "<\/url>" parts in the second command are the placemarkers (the anchors). In between is the name of the file that was backed up. Similar to your expect issue, a filename may contain whitespace, so I pipe the output through the "tr" command to replace newlines with NUL characters. The output is then just as if it came from a find command using the "-print0" argument, so I can use "xargs -0".
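For completeness, here is a rough sketch of the whole pipeline with entity handling included, run inside a throwaway directory so it can't delete anything real. The file names and the cut-down maindata.xml are invented for the demo, and the three entity commands are my guess at a reasonable form, not necessarily the exact ones from my script. Two things to watch: '&' in a sed replacement means "the whole match", so it has to be written as '\&', and '&amp;' should be decoded last so that something like '&amp;lt;' doesn't get decoded twice.

```shell
# Throwaway directory so the demo cannot touch real files.
workdir=$(mktemp -d)

(
  cd "$workdir" || exit 1

  # Hypothetical files: one name with a space, one with an ampersand,
  # and one that is not listed in the xml and must survive.
  touch 'my file.txt' 'a&b.txt' 'keep.txt'

  # Cut-down stand-in for maindata.xml (the real file has more in it).
  cat > maindata.xml <<'EOF'
<header>not a filename</header>
<url>my file.txt</url>
<url>a&amp;b.txt</url>
EOF

  # Keep only <url> lines, strip the tags, decode the entities,
  # then hand NUL-separated names to xargs.
  sed -e '/^<url>/!d' \
      -e 's/<url>\(.*\)<\/url>/\1/' \
      -e 's/&lt;/</g' \
      -e 's/&gt;/>/g' \
      -e 's/&amp;/\&/g' \
      maindata.xml | tr '\n' '\000' | xargs -0 rm
)

# List what is left in the demo directory.
ls "$workdir"
```

While testing against real data, swapping "rm" for "ls -l" is a cheap way to check which files would be hit before anything is removed.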