Web Content Extractor
Command Line Options
Web Content Extractor commands can be run from the command line.
To launch the program at a specified time, use Windows Task Scheduler
or any other scheduler, and control the launch parameters from the
command line. Keys may be prefixed with either "-" or "/".
Syntax: WCExtractor.exe ["projectfile"] [-dr] [-dt] [-rt] [-at"filename"] [-s] [-ddr] [-fr["column name{criteria index}value"]] [-qe["filename"]] [-ex]
projectfile - the file name of the project (*.wcepr) to open.
Key | Command
-dr | Delete all results/records.
-dt | Delete all tasks.
-rt | Reset all tasks.
-at"filename" | Add new tasks from a file. filename is the name of the CSV or TXT file that contains URLs separated by newlines.
-s | Start the extraction process.
-ddr | Delete duplicate records.
-fr["column name{criteria index}value"] | Filter results. Criteria index: 0 - Contains, 1 - DoesNotContain, 2 - Equals, 3 - DoesNotEqual, 4 - BeginsWith, 5 - DoesNotBeginWith, 6 - EndsWith, 7 - DoesNotEndWith.
-qe["filename"] | Export results/records. filename is the name of the output file.
-ex | Exit when all tasks are done.
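The -fr key packs three things into one quoted string: a column name, a numeric criteria index, and a value. The following Python sketch shows one way to assemble that string from named criteria. It assumes the curly braces in the syntax line are typed literally around the index; this page does not show a concrete -fr example, so verify the result against your version of the program. The helper name filter_key and the column "Title" are hypothetical.

```python
from enum import IntEnum


class Criteria(IntEnum):
    """Criteria indexes as listed in the key table above."""
    CONTAINS = 0
    DOES_NOT_CONTAIN = 1
    EQUALS = 2
    DOES_NOT_EQUAL = 3
    BEGINS_WITH = 4
    DOES_NOT_BEGIN_WITH = 5
    ENDS_WITH = 6
    DOES_NOT_END_WITH = 7


def filter_key(column, criteria, value):
    """Build a -fr key (hypothetical helper, not part of the program).

    ASSUMPTION: the braces around the criteria index are literal characters,
    so the result looks like -fr"Title{0}sale".
    """
    index = int(Criteria(criteria))  # raises ValueError for an index outside 0..7
    return '-fr"{}{{{}}}{}"'.format(column, index, value)


print(filter_key("Title", Criteria.CONTAINS, "sale"))
```

Using an enumeration keeps the numeric indexes out of calling code and rejects values that are not in the table.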
Examples
To launch the program, open the "myproject.wcepr" project file, delete
all previous results, reset all tasks, start the extraction process,
delete duplicate records, export the data, and close the program, use
the following command:
"C:\Program Files\Web Content Extractor\WCExtractor.exe" "C:\Program Files\Web Content Extractor\myproject.wcepr" -dr -rt -s -ddr -qe -ex
To launch the program, open the "myproject.wcepr" project file, delete
all previous tasks, add new tasks from the "urls.csv" file, start
executing the tasks, delete duplicate records, and close the program,
use the following command:
"C:\Program Files\Web Content Extractor\WCExtractor.exe" "C:\Program Files\Web Content Extractor\myproject.wcepr" -dt -at"C:\Program Files\Web Content Extractor\urls.csv" -s -ddr -ex
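If the program is launched from a script rather than typed by hand, the same command lines can be assembled programmatically. The Python sketch below builds the command as a single string so that the quoting matches the examples above exactly. The function names are hypothetical, and the installation path is the one used in the examples and may differ on your machine.

```python
import subprocess

# Path taken from the examples above; adjust it for your installation.
WCE_EXE = r"C:\Program Files\Web Content Extractor\WCExtractor.exe"


def add_tasks_key(filename):
    """Format the -at key in its documented form: -at"filename"."""
    return '-at"{}"'.format(filename)


def build_command_line(project_file, keys, exe=WCE_EXE):
    """Return a command line string: quoted exe, quoted project file, then keys.

    keys is a list of already formatted keys such as "-dr" or the result of
    add_tasks_key(); they are appended in the order given.
    """
    parts = ['"{}"'.format(exe), '"{}"'.format(project_file)]
    parts.extend(keys)
    return " ".join(parts)


def launch(command_line):
    """Run the command (Windows only); raises CalledProcessError on a non-zero exit code."""
    subprocess.run(command_line, check=True)


project = r"C:\Program Files\Web Content Extractor\myproject.wcepr"
urls = r"C:\Program Files\Web Content Extractor\urls.csv"
print(build_command_line(project, ["-dt", add_tasks_key(urls), "-s", "-ddr", "-ex"]))
```

The command is kept as one string instead of an argument list because the -at key attaches a quoted file name directly to the key, and passing a string lets that quoting reach the program unchanged on Windows.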
Note: The program exports data using the export configuration that
was last executed in the project. If the project has never been
exported, this function is not available.
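To run one of the command lines above at a specified time with Windows Task Scheduler, the task can be registered from a script through the schtasks utility that ships with Windows. The Python sketch below builds the arguments for a daily task. The task name "WCE nightly run" and the start time 02:00 are hypothetical values chosen for illustration, and the command is the first example on this page.

```python
import subprocess

# Command line copied from the first example on this page.
WCE_COMMAND = (
    r'"C:\Program Files\Web Content Extractor\WCExtractor.exe" '
    r'"C:\Program Files\Web Content Extractor\myproject.wcepr" '
    "-dr -rt -s -ddr -qe -ex"
)


def schtasks_args(task_name, start_time, command=WCE_COMMAND):
    """Build the argument list for a daily `schtasks /Create` call."""
    return [
        "schtasks", "/Create",
        "/TN", task_name,   # name shown in Task Scheduler
        "/TR", command,     # command line the task runs
        "/SC", "DAILY",     # schedule type
        "/ST", start_time,  # start time, 24-hour HH:MM
    ]


def register(task_name="WCE nightly run", start_time="02:00"):
    """Create the scheduled task (Windows only); the defaults are example values."""
    subprocess.run(schtasks_args(task_name, start_time), check=True)


print(schtasks_args("WCE nightly run", "02:00"))
```

Including -ex in the scheduled command closes the program once all tasks are done, so each run starts from a fresh launch.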