Thursday, 25 September 2008

Downloading all files of a specific type from a web server to your Linux desktop

Everyone knows that Ubuntu has the wget command. But its default behavior is not very convenient if, for example, you only want a backup of your home directory on a web server. If you don't have wget, you can install it by opening a shell and typing:


sudo apt-get install wget


Example case: You have lots of pictures on your home page, in an images folder. You want to get them all to your Linux desktop, i.e. you want to make a backup. But you want only the images, not the other files in that folder. What do you do? Or what if you want to download all my music without having to click the download link for each file separately?

I asked the Linux gurus around me, but the only answer I got was to read the man pages. OK, I went and read the man pages. There were some examples, but they did not do what they promised until I combined them into the following:

This command mirrors my music folder on our server to your hard disk. It has one unwanted side effect, though, in that it keeps the web server's directory structure:


wget -r -l1 --no-parent -L -A.mp3 http://www.katix.org/karoliina/music/
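
For reference, here is my reading of what each option in that command does, written as a small shell function so that a comment can sit next to every option. The function name mirror_mp3 and the WGET variable are made up for this sketch and are not part of wget; setting WGET=echo only prints the arguments instead of downloading anything.

```shell
# Hypothetical helper: the same options as the command above, one per line.
mirror_mp3() {
    url=$1
    set --                    # start with an empty argument list
    set -- "$@" -r            # recursive: follow the links found on the page
    set -- "$@" -l1           # ...but go only one level deep
    set -- "$@" --no-parent   # never climb above the starting directory
    set -- "$@" -L            # follow relative links only
    set -- "$@" -A.mp3        # accept (keep) only files ending in .mp3
    # WGET is an assumption of this sketch: it defaults to the real wget.
    ${WGET:-wget} "$@" "$url"
}

# Dry run: print the wget arguments without touching the network.
WGET=echo mirror_mp3 http://www.katix.org/karoliina/music/
```

Calling mirror_mp3 without WGET=echo runs the real download, with the same directory-structure side effect.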


I just wanted the MP3 files from one folder, so I searched the man page further.

Here is the line that did what I wanted:


wget -r -l1 --no-parent -L -A.mp3 -p --convert-links -nH -nd -P./ http://www.katix.org/karoliina/music/


It still downloads robots.txt, but you can simply delete that unnecessary file afterwards. And if you want all my music, here is what to do: in a shell (terminal), go to your desired folder and copy-paste the command line above. You'll be done as quickly as the network allows.
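
The same recipe should cover the images case from the beginning of this post. Below is a minimal sketch that keeps the options of the final command unchanged and only turns the suffix list and the target folder into parameters, then deletes the leftover robots.txt. The function name fetch_by_ext, the WGET variable, the example.com address and the ./backup folder are all placeholders invented for the example. It relies on wget's -A option accepting a comma-separated list of suffixes.

```shell
# Hypothetical helper: fetch_by_ext URL EXTENSIONS [DEST]
# EXTENSIONS is a comma-separated list such as "jpg,png"; DEST defaults to ".".
fetch_by_ext() {
    url=$1
    exts=$2
    dest=${3:-.}
    # Same options as the final command above; only the accept list (-A)
    # and the target directory (-P) are parameters here.
    ${WGET:-wget} -r -l1 --no-parent -L -A "$exts" -p --convert-links \
        -nH -nd -P "$dest" "$url"
    status=$?
    # Remove the leftover robots.txt. Careful: this also deletes a
    # robots.txt of your own if one already sits in DEST.
    rm -f -- "$dest/robots.txt"
    return "$status"
}

# Dry run with a placeholder address: prints the arguments, downloads nothing.
WGET=echo fetch_by_ext http://www.example.com/images/ jpg,jpeg,png,gif ./backup
```

Drop the WGET=echo prefix and put in your own address to do the real download.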