Options for listing the files in a directory with Python

http://www.saltycrane.com/blog/2010/04/options-listing-files-directory-python/

I do a lot of sysadmin-type work with Python, so I often need to list the contents of a directory on a filesystem. Here are 4 methods I’ve used so far to do that. Let me know if you have any good alternatives. The examples were run on my Ubuntu Karmic machine.

OPTION 1 - os.listdir()

This is probably the simplest way to list the contents of a directory in Python.

import os
dirlist = os.listdir("/usr")

from pprint import pprint
pprint(dirlist)

Results:

['lib',
 'shareFeisty',
 'src',
 'bin',
 'local',
 'X11R6',
 'lib64',
 'sbin',
 'share',
 'include',
 'lib32',
 'man',
 'games']
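Note that os.listdir() returns bare names (no directory prefix) in arbitrary order, and it mixes files and subdirectories. Here is a minimal sketch of one way to get sorted full paths of just the regular files. The helper name list_files is my own, made up for illustration.

```python
import os

def list_files(path):
    """Return sorted full paths of the regular files directly inside path."""
    # os.listdir gives bare names in arbitrary order, so sort them first
    names = sorted(os.listdir(path))
    # join each name with the directory to get a usable path
    full_paths = [os.path.join(path, name) for name in names]
    # keep regular files only; subdirectories are dropped
    return [p for p in full_paths if os.path.isfile(p)]
```

Calling list_files("/usr") on the machine above would probably return very little, since almost everything directly under /usr is a directory.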

OPTION 2 - glob.glob()

This method allows you to use shell-style wildcards.

import glob
dirlist = glob.glob('/usr/*')

from pprint import pprint
pprint(dirlist)

Results:

['/usr/lib',
 '/usr/shareFeisty',
 '/usr/src',
 '/usr/bin',
 '/usr/local',
 '/usr/X11R6',
 '/usr/lib64',
 '/usr/sbin',
 '/usr/share',
 '/usr/include',
 '/usr/lib32',
 '/usr/man',
 '/usr/games']
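The wildcard does not have to be a bare "*". As a rough sketch, assuming you want only the entries with a given extension, something like the following should work. The helper name glob_by_extension is hypothetical, not part of the glob module.

```python
import glob
import os

def glob_by_extension(directory, extension):
    """Return sorted paths in directory whose names end with extension."""
    # build a pattern such as /var/log/*.log
    pattern = os.path.join(directory, '*' + extension)
    # glob.glob returns matches in arbitrary order, so sort for stable output
    return sorted(glob.glob(pattern))
```

As in the shell, "*" does not match names that start with a dot, so hidden files are skipped.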

OPTION 3 - Unix “ls” command using subprocess

This method uses your operating system’s “ls” command, so it only works on Unix-like systems. It allows you to sort the output by modification time, file size, etc. by passing the corresponding command-line options to “ls”. The following example lists the 10 most recently modified entries in /var/log:

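from subprocess import Popen, PIPE

def listdir_shell(path, *lsargs):
    # Options go before the path so that non-GNU versions of ls accept them too.
    # universal_newlines=True makes stdout yield text rather than bytes on Python 3.
    p = Popen(('ls',) + lsargs + (path,), shell=False, stdout=PIPE,
              close_fds=True, universal_newlines=True)
    output = p.communicate()[0]
    return output.splitlines()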

dirlist = listdir_shell('/var/log', '-t')[:10]

from pprint import pprint
pprint(dirlist)

Results:

['auth.log',
 'syslog',
 'dpkg.log',
 'messages',
 'user.log',
 'daemon.log',
 'debug',
 'kern.log',
 'munin',
 'mysql.log']
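If you would rather not shell out, a similar ordering can be had in pure Python by sorting on modification time. This is only a sketch under that assumption. The function name listdir_by_mtime is made up for illustration, and unlike plain “ls” it also includes hidden entries, because os.listdir() returns them.

```python
import os

def listdir_by_mtime(path):
    """Return the names in path, most recently modified first."""
    def mtime(name):
        # os.path.getmtime needs the full path, not the bare name
        return os.path.getmtime(os.path.join(path, name))
    return sorted(os.listdir(path), key=mtime, reverse=True)
```

For the example above, listdir_by_mtime('/var/log')[:10] would give the 10 most recently modified entries.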

OPTION 4 - Unix “find” style using os.walk

This method allows you to list directory contents recursively in a manner similar to the Unix “find” command. It uses Python’s os.walk.

import os

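def unix_find(pathin):
    """Return results similar to the Unix find command run without options,
    i.e. traverse a directory tree and return all the file paths.
    Unlike find, directories themselves are not included in the result.
    """
    # os.walk yields a (path, dirs, files) tuple for each directory in the tree
    return [os.path.join(path, filename)
            for (path, dirs, files) in os.walk(pathin)
            for filename in files]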

pathlist = unix_find('/etc')[-10:]

from pprint import pprint
pprint(pathlist)

Results:

['/etc/fonts/conf.avail/20-lohit-gujarati.conf',
 '/etc/fonts/conf.avail/69-language-selector-zh-mo.conf',
 '/etc/fonts/conf.avail/11-lcd-filter-lcddefault.conf',
 '/etc/cron.weekly/0anacron',
 '/etc/cron.weekly/cvs',
 '/etc/cron.weekly/popularity-contest',
 '/etc/cron.weekly/man-db',
 '/etc/cron.weekly/apt-xapian-index',
 '/etc/cron.weekly/sysklogd',
 '/etc/cron.weekly/.placeholder']
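The real “find” command also prints the directories it visits, including the starting point. Here is a sketch of a variant that does the same. The name unix_find_all is hypothetical, and the order of the results will not necessarily match what “find” prints.

```python
import os

def unix_find_all(pathin):
    """Return pathin plus every directory and file path beneath it."""
    results = [pathin]
    for path, dirs, files in os.walk(pathin):
        # include subdirectories as well as files, as find does by default
        for name in dirs + files:
            results.append(os.path.join(path, name))
    return results
```

By default os.walk does not follow symbolic links to directories, so a linked directory is listed but not descended into.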

6 Comments

#1 Keith Beattie commented on 2010-05-17: Adding a regexp to your option #1 is a quick way to get Python’s re module into play when shell wildcards won’t cut it:

import os, pprint, re

pat = re.compile(r".+\d.+")
dirlist = list(filter(pat.match, os.listdir("/usr/local")))  # list() is needed on Python 3, where filter() returns an iterator

pprint.pprint(dirlist)

gives me (on my FreeBSD box)

['diablo-jdk1.6.0',
 'netbeans68',
 'openoffice.org-3.2.0',
 'i386-portbld-freebsd7.3']

#2 Eliot commented on 2010-05-18: Keith: That’s a good tip. I will give it a try the next time I get a chance. Thanks!

#3 Al Jaffe commented on 2010-06-14: …and how about an easy way of listing the contents of a WEB directory? Could any of the above techniques be used?

#4 Directory commented on 2010-11-23: I’m just learning Python for my job and this has been a really useful reference page for me!! I realise it’s only really useful for one thing, but the methods you’ve shown are perfect for particular types of directory listings in my code ;) .

#5 gsiliceo commented on 2011-04-16: I recently started learning Python and I love your blog. I’m constantly looking for best practices and “solved” problems.

#6 Eriksen commented on 2011-08-22: I’m also just learning Python for my job and this has been a really useful reference page for me.

I hope you can post more about system administration on both Unix and Windows.

Keep up the good work man ;)