I tried running the HTTrack website copier from the command line on an Ubuntu machine. Although I followed the filter syntax described on their website, the filters do not seem to take effect: HTTrack mirrors the entire site.
I have tried the following commands:
- To download only PNG files:
httrack https://www.elastic.co/ +*.png -O "/tmp/output" --robots=0 --user-agent="Mozilla/5.0" -v
httrack [url] -* +*.png -O "/tmp/output" --robots=0 --user-agent="Mozilla/5.0" -v
- To download all files smaller than 50 KB:
httrack https://www.elastic.co/ -*[<50] -O "/tmp/output" --robots=0 --user-agent="Mozilla/5.0" -v
I have tried several other filters and approaches as well, but none of them worked.
Please let me know if I am doing something wrong.
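One thing worth checking, offered as a sketch rather than a confirmed fix: the filters in the commands above are unquoted, so the shell may alter them before HTTrack sees them. An unquoted `<` is parsed by the shell as input redirection, and an unquoted `*` can be glob-expanded if a matching file exists in the current directory. The helper `show_args` below is hypothetical and exists only to print the arguments as a program would receive them. The `-*[>50]` variant is an assumption based on my reading of the HTTrack filter documentation, where `*[<NN]` and `*[>NN]` are size conditions in KB, which would make `-*[<50]` exclude the small files instead of keeping them.

```shell
#!/bin/sh
# Sketch: quote every HTTrack filter so the shell passes it through untouched.

url="https://www.elastic.co/"   # start URL from the question
outdir="/tmp/output"            # output directory from the question

# Hypothetical helper: print each argument on its own line, in brackets,
# exactly as the called program would receive it.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# PNG-only attempt: exclude everything, then re-allow PNG files.
show_args httrack "$url" "-*" "+*.png" -O "$outdir" \
    --robots=0 --user-agent="Mozilla/5.0" -v

# Size-limited attempt (assumption): refuse anything larger than 50 KB.
show_args httrack "$url" "-*[>50]" -O "$outdir" \
    --robots=0 --user-agent="Mozilla/5.0" -v
```

Removing `show_args` from the front of each line would run the real command. Whether the quoted filters then behave as intended still depends on HTTrack itself, for example on whether the HTML pages that link to the images are excluded by `-*`.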
Author: استخدام کار
Date: Saturday, 29 Khordad 1395, 18:53