Options for listing the files in a directory with Python
http://www.saltycrane.com/blog/2010/04/options-listing-files-directory-python/
I do a lot of sysadmin-type work with Python, so I often need to list the contents of a directory on a filesystem. Here are four methods I’ve used so far to do that. Let me know if you have any good alternatives. The examples were run on my Ubuntu Karmic machine.
OPTION 1 - os.listdir()
This is probably the simplest way to list the contents of a directory in Python.
import os
dirlist = os.listdir("/usr")
from pprint import pprint
pprint(dirlist)
Results:
['lib', 'shareFeisty', 'src', 'bin', 'local', 'X11R6', 'lib64', 'sbin', 'share', 'include', 'lib32', 'man', 'games']
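Note that os.listdir() returns bare names, not full paths, and it mixes files and directories together. Here is a minimal sketch of one way to get full paths for regular files only. The helper name list_files_only is my own invention for illustration, not something from the standard library.

```python
import os


def list_files_only(directory):
    """Return sorted full paths of the regular files directly inside `directory`.

    Hypothetical helper for illustration; it does not recurse into subdirectories.
    """
    paths = []
    for name in os.listdir(directory):
        # os.listdir() yields bare names, so join each one with the directory
        full_path = os.path.join(directory, name)
        if os.path.isfile(full_path):
            paths.append(full_path)
    return sorted(paths)
```

Swapping os.path.isfile for os.path.isdir would give the subdirectories instead.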
OPTION 2 - glob.glob()
This method allows you to use shell-style wildcards.
import glob
dirlist = glob.glob('/usr/*')
from pprint import pprint
pprint(dirlist)
Results:
['/usr/lib', '/usr/shareFeisty', '/usr/src', '/usr/bin', '/usr/local', '/usr/X11R6', '/usr/lib64', '/usr/sbin', '/usr/share', '/usr/include', '/usr/lib32', '/usr/man', '/usr/games']
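The wildcard part is where glob earns its keep. As a rough sketch, the pattern can be narrowed to match only some of the entries. The wrapper name list_matching is made up for this example.

```python
import glob
import os


def list_matching(directory, pattern):
    """Return sorted paths in `directory` whose names match a shell-style pattern.

    Hypothetical wrapper for illustration; `pattern` uses glob syntax
    such as "*", "?" and "[0-9]".
    """
    return sorted(glob.glob(os.path.join(directory, pattern)))
```

With the /usr listing above, list_matching("/usr", "lib*") should pick out /usr/lib, /usr/lib32 and /usr/lib64. One thing to keep in mind: a leading "*" does not match names that start with a dot, so hidden files are skipped unless the pattern begins with a dot itself.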
OPTION 3 - Unix “ls” command using subprocess
This method uses your operating system’s “ls” command. It lets you sort the output by modification time, file size, etc. by passing the corresponding command-line options to “ls”. The following example lists the 10 most recently modified files in /var/log:
from subprocess import Popen, PIPE

def listdir_shell(path, *lsargs):
    # universal_newlines=True makes stdout yield text instead of bytes on Python 3
    p = Popen(('ls', path) + lsargs, shell=False, stdout=PIPE,
              close_fds=True, universal_newlines=True)
    return [line.rstrip('\n') for line in p.stdout.readlines()]

dirlist = listdir_shell('/var/log', '-t')[:10]
from pprint import pprint
pprint(dirlist)
Results:
['auth.log', 'syslog', 'dpkg.log', 'messages', 'user.log', 'daemon.log', 'debug', 'kern.log', 'munin', 'mysql.log']
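If shelling out is not an option (no “ls” on Windows, for instance), a similar ordering can be had in pure Python. This is a sketch of a different technique from the subprocess approach above, not a drop-in replacement for it. It sorts the names from os.listdir() by modification time, and the function name listdir_by_mtime is an assumption of mine.

```python
import os


def listdir_by_mtime(path, limit=None):
    """Return entry names in `path`, most recently modified first.

    Hypothetical helper for illustration. os.lstat() is used so that a
    broken symlink is sorted by its own timestamp instead of raising.
    """
    names = os.listdir(path)
    names.sort(
        key=lambda name: os.lstat(os.path.join(path, name)).st_mtime,
        reverse=True,
    )
    # A limit of None slices to the end, i.e. returns every entry
    return names[:limit]
```

Calling listdir_by_mtime('/var/log', 10) is meant to play the same role as listdir_shell('/var/log', '-t')[:10], though entries with identical timestamps may come out in a different order than “ls” gives.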
OPTION 4 - Unix “find” style using os.walk
This method allows you to list directory contents recursively in a manner similar to the Unix “find” command. It uses Python’s os.walk.
import os

def unix_find(pathin):
    """Return results similar to the Unix find command run without options
    i.e. traverse a directory tree and return all the file paths
    """
    return [os.path.join(path, file)
            for (path, dirs, files) in os.walk(pathin)
            for file in files]

pathlist = unix_find('/etc')[-10:]
from pprint import pprint
pprint(pathlist)
Results:
['/etc/fonts/conf.avail/20-lohit-gujarati.conf', '/etc/fonts/conf.avail/69-language-selector-zh-mo.conf', '/etc/fonts/conf.avail/11-lcd-filter-lcddefault.conf', '/etc/cron.weekly/0anacron', '/etc/cron.weekly/cvs', '/etc/cron.weekly/popularity-contest', '/etc/cron.weekly/man-db', '/etc/cron.weekly/apt-xapian-index', '/etc/cron.weekly/sysklogd', '/etc/cron.weekly/.placeholder']
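The same walk can be narrowed by file name, roughly like “find -name”. Below is a minimal sketch that combines os.walk with fnmatch. The function name find_by_name is hypothetical, and unlike the real “find” it only reports files, not directories.

```python
import fnmatch
import os


def find_by_name(top, pattern):
    """Yield paths of files under `top` whose name matches `pattern`.

    Hypothetical helper for illustration. The pattern is matched against
    the file name only, not against the directory part of the path.
    """
    for dirpath, dirnames, filenames in os.walk(top):
        for name in fnmatch.filter(filenames, pattern):
            yield os.path.join(dirpath, name)
```

For example, list(find_by_name('/etc', '*.conf')) should collect every .conf file below /etc that the current user is able to reach. Because it is a generator, large trees can be processed one path at a time without building the whole list first.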
Comments
import os, pprint, re
pat = re.compile(r".+\d.+")
dirlist = list(filter(pat.match, os.listdir("/usr/local")))
pprint.pprint(dirlist)
gives me (on my FreeBSD box)
['diablo-jdk1.6.0',
'netbeans68',
'openoffice.org-3.2.0',
'i386-portbld-freebsd7.3']
I hope you can post more about system administration on both Unix and Windows.
Keep up the good work, man.