Improve Pacman Performance
From ArchWiki
Improving database access speeds
Pacman stores all package information in a collection of small files, one for each package. Improving database access speeds reduces the time taken in database-related tasks, e.g. searching packages and resolving package dependencies.
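To get a feel for how many small files are involved, you can count them. The following is only a rough sketch: the function name count_db_files is made up for this example, and it assumes the database lives under /var/lib/pacman unless another directory is given.

```shell
# count_db_files [DIR]
# Prints the number of regular files below DIR.
# DIR defaults to /var/lib/pacman (assumed database location).
count_db_files() {
  find "${1:-/var/lib/pacman}" -type f | wc -l | tr -d ' '
}
```

Running count_db_files without arguments on an installed system gives the number of files pacman may have to touch during database operations.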
The safest and easiest method is to run
# pacman-optimize && sync
as root. This will attempt to put all the small files together in one (physical) location on the hard disk so that the hard disk head does not have to move so much when accessing all the packages. This method is safe, but it is not guaranteed to help: the result depends on your filesystem, disk usage and free-space fragmentation.
Further tweaks
ody has posted a script on the forum that replaces the current Pacman database with a loopback filesystem which ensures that all the small files continue to stay together on the hard disk. Several users have reported great improvements, but problems have also been reported so do not do this unless you are an expert user.
To use ody's script you must have a kernel compiled with loopback filesystem support. The default kernels already have this, so you only need to be concerned with this if you compile your own custom kernel.
Improving download speeds
First, if your download speeds have slowed to a crawl, make sure you are using one of the many mirrors and not ftp.archlinux.org, which has been throttled since March 2007.
Pacman's speed in downloading packages can be improved by using a different application to download packages instead of Pacman's built-in file downloader.
In all cases, make sure you have the latest Pacman before doing any modifications.
# pacman -Sy pacman
Using wget
Using wget is also very handy if you need more powerful proxy settings than pacman's built-in capabilities.
To use wget, first install it with pacman -S wget and then modify /etc/pacman.conf by adding the following line to the [options] section:
XferCommand = /usr/bin/wget --passive-ftp -c %u
Instead of putting wget parameters in /etc/pacman.conf, you can also modify the wget configuration file directly (the system-wide file is /etc/wgetrc, per-user files are $HOME/.wgetrc).
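If you prefer not to edit the file by hand, the change can be sketched with a small shell function. This is only an illustration: set_xfercommand is a hypothetical helper, not part of pacman. It prints the modified configuration to standard output instead of changing the file, and assumes the [options] header sits at the start of a line.

```shell
# set_xfercommand FILE COMMAND
# Prints FILE with any active XferCommand line removed and a new
# "XferCommand = COMMAND" line placed directly after the [options] header.
# Commented-out XferCommand lines are left untouched.
set_xfercommand() {
  XFER=$2 awk '
    /^[ \t]*XferCommand[ \t]*=/ { next }
    { print }
    /^\[options\]/ { print "XferCommand = " ENVIRON["XFER"] }
  ' "$1"
}
```

For example, set_xfercommand /etc/pacman.conf '/usr/bin/wget --passive-ftp -c %u' > /tmp/pacman.conf.new writes a candidate file that you can review before copying it into place as root.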
Using aria2
According to the aria2 website, aria2 is "a download utility with resuming and segmented downloading. Supports HTTP/HTTPS/FTP/BitTorrent/Metalink." This means that you can make several HTTP/FTP connections to an Arch mirror at the same time, which should result in an increase in download speeds.
Install it with pacman -S aria2 and then edit /etc/pacman.conf by adding the following line to the [options] section:
XferCommand = /usr/bin/aria2c --no-conf -s 2 -m 2 -d / -o %o %u
Let's run over the options here:
- /usr/bin/aria2c - the location of the aria2 application
- --no-conf - do not use a parameterized configuration file if one is available in ~/.aria2 (typically ~/.aria2/aria2.conf)
- -s 2 - use 2 concurrent connections (you can set this higher if you want, but it's not going to do a whole lot)
- -m 2 - make 2 attempts to download the package per mirror
- -d / - use / as the download directory, so that the full path given with -o is used as-is
- -o %o - output to the file pacman specifies
- %u - download the file pacman specifies
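To see what the final command looks like once the %o and %u placeholders have been filled in, here is a minimal sketch. The function expand_xfer is hypothetical, and the URL and file path in the usage note are made-up examples rather than values taken from a real pacman run.

```shell
# expand_xfer TEMPLATE URL OUTFILE
# Prints TEMPLATE with every %o replaced by OUTFILE and every %u by URL.
# The replacement is literal, so characters such as & or / need no escaping.
expand_xfer() {
  XFER_TEMPLATE=$1 XFER_URL=$2 XFER_OUT=$3 awk '
    function replace_all(s, token, value,   out, i) {
      out = ""
      while ((i = index(s, token)) > 0) {
        out = out substr(s, 1, i - 1) value
        s = substr(s, i + length(token))
      }
      return out s
    }
    BEGIN {
      cmd = ENVIRON["XFER_TEMPLATE"]
      cmd = replace_all(cmd, "%o", ENVIRON["XFER_OUT"])
      cmd = replace_all(cmd, "%u", ENVIRON["XFER_URL"])
      print cmd
    }'
}
```

Calling expand_xfer '/usr/bin/aria2c --no-conf -s 2 -m 2 -d / -o %o %u' 'ftp://mirror.example/core/os/i686/foo.pkg.tar.gz' '/var/cache/pacman/pkg/foo.pkg.tar.gz.part' prints the aria2c command line with both placeholders substituted, which is handy for testing an XferCommand by hand.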
Powerpill
Powerpill is a wrapper for pacman that uses aria2 to download packages. Unlike the other aria2 solutions, powerpill uses simultaneous downloads for all files and segmented downloads only for larger files, which really makes the most of your bandwidth without wasting time splitting small files unnecessarily. Powerpill is available in the community repo.
# pacman -S powerpill
For more info, see the Powerpill wiki article.
Using airpac
In a nutshell, airpac is an aria2c wrapper for pacman. Unlike powerpill, which acts as a frontend to pacman, airpac serves as a backend downloader for pacman. As far as downloading is concerned, however, it behaves similarly to powerpill, since both use aria2c to actually download the files. Because it is a backend, though, it cannot download multiple packages simultaneously as powerpill can.
Essentially, airpac is the Python implementation of the pacget script below. However, the main difference lies in the handling of aria2c output. airpac shows only the most relevant info, i.e., the download progress, although it currently doesn't use a progressbar (maybe in the near future). Also, airpac caches the db files so that they won't be downloaded for every pacman -Sy. On the downside, this breaks pacman -Syy since airpac has no way of knowing the options pacman is executed with. As a workaround, however, one can use pacman -Sc to delete the cached files in /var/lib/pacman/.airpac.
The configuration file is located in /etc/airpac.conf. This is actually an aria2c config file. Because of this, the user can directly configure how aria2c is used by airpac without meddling with airpac's code. For more info about the available options, consult the aria2c manpage.
airpac also uses the Server Performance Profile feature of aria2c by default. The statistics file is located in /var/lib/airpac.stats. The default URI selector is adaptive.
Usage in /etc/pacman.conf
XferCommand = /usr/bin/airpac %u %o
pacget (aria2) Mirror Script
This script will greatly improve the download speed for broadband users. It uses the servers in /etc/pacman.d/mirrorlist as mirrors in aria2. What happens is that aria2 downloads from multiple servers simultaneously which gives a huge boost in download speed.
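The core idea, turning each Server line of a mirrorlist into a download URL for the same file, can be sketched on its own. This is a simplified illustration rather than the pacget script itself: mirror_urls is a hypothetical helper, and it assumes uncommented lines of the form "Server = URL" in which $repo appears at most once.

```shell
# mirror_urls MIRRORLIST REPO FILENAME
# Prints one candidate URL per active Server line in MIRRORLIST,
# with $repo replaced by REPO and FILENAME appended.
mirror_urls() {
  REPO=$2 FILENAME=$3 awk '
    /^Server[ \t]*=/ {
      url = $0
      sub(/^Server[ \t]*=[ \t]*/, "", url)
      i = index(url, "$repo")          # literal text "$repo", 5 characters
      if (i > 0)
        url = substr(url, 1, i - 1) ENVIRON["REPO"] substr(url, i + 5)
      print url "/" ENVIRON["FILENAME"]
    }' "$1"
}
```

The resulting list of URLs is what gets handed to aria2c, which can then fetch the same file from several servers at once.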
Note that you have to put 'exec' before /usr/bin/pacget in the XferCommand. This is needed so that when you terminate pacget or aria2 (which runs with the process ID used by pacget), pacman terminates as well. Otherwise pacman would carry on downloading a file after you have told it to stop.
WARNING: You may experience some problems if the mirrors used are out-of-sync or are simply not up-to-date. Just use the Reflector script to generate a list of up-to-date and fast mirrors. Also, ftp.archlinux.org resolves to two IPs. You may want to choose only one of them and hard code ftp.archlinux.org and the chosen IP address to /etc/hosts.
/usr/bin/pacget
#!/bin/bash

# Print an informational message to stderr
msg() {
  echo ""
  echo -e " \033[1;34m->\033[1;0m \033[1;1m${1}\033[1;0m" >&2
}

# Print an error message to stderr
error() {
  echo -e "\033[1;31m==> ERROR:\033[1;0m \033[1;1m$1\033[1;0m" >&2
}

CONF=/etc/pacget.conf
STATS=/etc/pacget.stats
ARIA2=$(which aria2c 2> /dev/null)

# ----- do some checks first -----
if [ ! -x "$ARIA2" ]; then
  error "aria2c was not found or isn't executable."
  exit 1
fi

if [ $# -ne 2 ]; then
  error "Incorrect number of arguments"
  exit 1
fi

filename=$(basename "$1")
server=${1%/$filename}

# Determine which repo is being used
repo=$(awk -F'/' '$(NF-2)~/^(community|core|extra|testing)$/{print $(NF-2)}' <<< "$server")
[ -z "$repo" ] && repo="custom"

# For db files, or when using a custom repo (which most likely doesn't have any mirror),
# use only the URL passed by pacman; otherwise, extract the list of servers
# (from the include file of the repo) to download from
url=$1
if ! [[ $filename = *.db.tar.gz || $repo = "custom" ]]; then
  mirrorlist=$(awk -F' *= *' '$0~"^\\["r"\\]",/Include *= */{l=$2} END{print l}' r=$repo /etc/pacman.conf)
  if [ -n "$mirrorlist" ]; then
    num_conn=$(grep ^split $CONF | cut -d'=' -f2)
    url=$(sed -r '/^Server *= */!d; s/Server *= *//; s/\$repo'"/$repo/; s:$:/$filename:" "$mirrorlist" | head -n $(($num_conn * 2)))
  fi
fi

msg "Downloading $filename"
cd /var/cache/pacman/pkg/
touch $STATS

# $url is left unquoted on purpose: it may hold several URLs, one per mirror
$ARIA2 --conf-path=$CONF --max-tries=1 --max-file-not-found=5 \
  --uri-selector=adaptive --server-stat-if=$STATS --server-stat-of=$STATS \
  --allow-overwrite=true --remote-time=true --log-level=error --summary-interval=0 \
  $url --out=${filename}.pacget && [ ! -f ${filename}.pacget.aria2 ] && mv ${filename}.pacget "$2" && chmod 644 "$2"

exit $?
/etc/pacget.conf
# The log file
log=/var/log/pacget.log
# Number of servers to download from
split=5
# Maximum download speed (0 = unrestricted)
max-download-limit=0
# Minimum download speed (0 = don't care)
lowest-speed-limit=0
# Server timeout period
timeout=5
# Passive FTP or not
ftp-pasv=true
# 'none' or 'prealloc'
file-allocation=none
Save this script as /usr/bin/pacget and make it executable:
# chmod 755 /usr/bin/pacget
In /etc/pacman.conf, in the [options] section, the following needs to be added:
XferCommand = exec /usr/bin/pacget %u %o
PS: If you use ftp.archlinux.org as the first server listed in your include files (/etc/pacman.d/*), some problems may occur when the mirrors you are using have not yet synced. To make great use of this script, choose a mirror (that syncs in a timely manner) that is more appropriate for you, then put that on top of the server lists. This is to prevent downloading only from ftp.archlinux.org when the mirrors have not yet synced. The rankmirrors python script can be useful in this case.
Using other applications
There are other downloading applications that you can use with Pacman. Here they are, and their associated XferCommand settings:
- snarf: XferCommand = /usr/bin/snarf -N %u
- lftp: XferCommand = /usr/bin/lftp -c pget %u
- axel: XferCommand = /usr/bin/axel -n 2 -v -a -o %o %u
Choosing the fastest mirror
When downloading packages, pacman uses the mirrors in the order in which they appear in /etc/pacman.d/mirrorlist. However, the mirror at the top of the list by default may not be the fastest for you.
Choosing a local mirror
The simple way is to edit the mirrorlist file and place a local mirror at the top of the list. pacman will then use this mirror in preference to the others.
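Moving a mirror to the top of the list can also be scripted. The sketch below is illustrative only: prefer_mirror is a hypothetical helper, and the mirror URL in the usage note is a made-up example. It prints the reordered list to standard output and leaves the original file alone.

```shell
# prefer_mirror MIRRORLIST URL
# Prints "Server = URL" first, followed by every line of MIRRORLIST
# except an existing active Server entry for the same URL.
prefer_mirror() {
  printf 'Server = %s\n' "$2"
  PREF=$2 awk '
    /^Server[ \t]*=/ {
      url = $0
      sub(/^Server[ \t]*=[ \t]*/, "", url)
      if (url == ENVIRON["PREF"]) next   # already printed at the top
    }
    { print }
  ' "$1"
}
```

For example, prefer_mirror /etc/pacman.d/mirrorlist 'ftp://near.example/$repo/os/i686' > /tmp/mirrorlist.new produces a candidate list to review before replacing the real one as root. The single quotes keep the shell from expanding $repo.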
Alternatively, the pacman.conf file can be edited by placing a local mirror before the line sourcing the mirrorlist file, i.e. where it says "add your preferred servers here". It is safer if you use the same server for each repository.
Using rankmirrors
You can use rankmirrors to rank pacman mirrors by their connection and opening speed.
Back up the original in case any problems come up:
mv /etc/pacman.d/mirrorlist /etc/pacman.d/mirrorlist.org
Then run rankmirrors to test and add the five fastest mirrors:
rankmirrors -n 5 /etc/pacman.d/mirrorlist.org > /etc/pacman.d/mirrorlist
See the help for more information.
rankmirrors -h
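The selection step, keeping only the fastest few servers, can be illustrated separately from the measuring. The sketch below is not how rankmirrors is implemented. It assumes you already have one line per mirror in the made-up format "SECONDS URL" on standard input, and pick_fastest is a hypothetical name.

```shell
# pick_fastest N
# Reads lines of the form "SECONDS URL" from stdin and prints the N
# entries with the lowest times as "Server = URL" lines, fastest first.
pick_fastest() {
  LC_ALL=C sort -n -k1,1 | head -n "$1" | while read -r secs url; do
    printf 'Server = %s\n' "$url"
  done
}
```

Feeding it three timed mirrors and asking for two, as in printf '0.82 ftp://slow.example\n0.11 ftp://fast.example\n0.35 ftp://mid.example\n' | pick_fastest 2, keeps the two with the lowest times.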
After changing mirrors
After changing your mirror, it is a good idea to refresh the pacman database. Using two y's forces a download of a fresh copy of the master package lists from the server, even if they are thought to be up to date.
# pacman -Syy
Sharing packages over your LAN
If you happen to run several Arch boxes on your LAN, you can share packages between them and greatly decrease your download times. Keep in mind that you should not share packages between different architectures (e.g. i686 and x86_64), or you will run into trouble. There are two ways to achieve this:
The do-it-yourself way
Get your hands dirty: http://wiki.archlinux.org/index.php/Howto_Upgrade_via_Home_Network
The easy way
Install and configure Xyne's pkgd: http://xyne.archlinux.ca/info/pkgd