Excel On The Command Line

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 19

Excel on the Command Line

Jon Allen
http://perl.jonallen.info - jj@jonallen.info
Perl ‘one-liners’
• Perl’s command-line options can be used to
manipulate data without writing a full script
perl -le ‘print 2+2’
perl -nle ‘/pattern/ and print’ file.txt

• Improves on standard UNIX tools (grep, sed, awk)


• Can use all features of Perl (special variables,
regular expressions, etc.)

Excel on the Command Line perl.jonallen.info


Perl command-line options
• Common options used for one-liners:
-e = execute Perl code
-n = non-printing input loop
-p = printing input loop
-l = automatic line-end processing
-a = autosplit fields to @F
-F = field separator for -a
-i = in-place file editing

• See ‘perldoc perlrun’ for the complete list


• Read ‘Minimal Perl’ (by Tim Maher, published by
Manning) for an in-depth tutorial

Excel on the Command Line perl.jonallen.info


Not all data is text
• However…
• These tools are limited to processing text data - it
is not possible to ‘grep’ a Word document or a PDF
file
• Not always a problem - log files, config data, are
often stored as plain text

• For non-techies, predominant file format is


Microsoft Excel (.xls)

Excel on the Command Line perl.jonallen.info


Excel - Perl for Windows?
• No, not really! :-)
• But there are similarities
– Ubiquitous
– Versatile
– Low barrier to entry - simple learning curve
– DWIM

• Perl = Swiss Army Chainsaw of the Internet


• Excel = Swiss Army Chainsaw of Business

Excel on the Command Line perl.jonallen.info


Introducing the Excel modules
• Always search the CPAN - the hard work may
already have been done!

• Reading:
– Spreadsheet::ParseExcel
– http://search.cpan.org/dist/Spreadsheet-ParseExcel
• Writing:
– Spreadsheet::WriteExcel
– http://search.cpan.org/dist/Spreadsheet-WriteExcel

Excel on the Command Line perl.jonallen.info


Spreadsheet::ParseExcel
• Read the value of cell A1 in the first worksheet
#! /Users/jj/bin/perl

use strict;
use warnings;
use Spreadsheet::ParseExcel;

my $file = shift @ARGV;


my $excel = Spreadsheet::ParseExcel::Workbook->Parse($file);
my $worksheet = $excel->Worksheet(0);
my $cell = $worksheet->Cell(0,0);

print $cell->Value,"\n";

Excel on the Command Line perl.jonallen.info


Spreadsheet::WriteExcel
• Create a new Excel file with a message in cell A1
#! /Users/jj/bin/perl

use strict;
use warnings;
use Spreadsheet::WriteExcel;

my $file = shift @ARGV;


my $excel = Spreadsheet::WriteExcel->new($file);
my $worksheet = $excel->add_worksheet();

$worksheet->write('A1','Hello, World!');

• Huge amount of Excel features supported

Excel on the Command Line perl.jonallen.info


Back to the command line
• The CPAN Excel modules work well, but they are
not really suitable for writing one-liners:

perl -MSpreadsheet::ParseExcel -e ‘$excel =


Spreadsheet::ParseExcel::Workbook->Parse(“file.xl

• We’ve run out of space before even loading a file!


• Need new tools…

Excel on the Command Line perl.jonallen.info


XLSgrep
• XLSgrep - proof of concept, simple replacement for
UNIX ‘grep’ command
XLSgrep ‘<pattern>’ file.xls

• Iterates over each cell in the spreadsheet, printing


those that match <pattern>
• See http://perl.jonallen.info/projects/xlstools
• Works, but very limited

Excel on the Command Line perl.jonallen.info


XLSgrep - source extract
use open IN => ":bytes";

my $excel = Spreadsheet::ParseExcel->new(
NotSetCell => 1,
CellHandler => sub {
my ($workbook,$sheet_index,$row,$col,$cell) = @_;
if (my $value = $cell->Value) {
chomp $value;
print "$value\n" if ($value =~ /$pattern/);
}
}
);

undef $/;
while (<>) {
$excel->Parse(\$_);
}

Excel on the Command Line perl.jonallen.info


XLSperl
• XLSperl - aims to allow any Perl one-liner to run
against an Excel file with little or no modification

• Standard ‘grep’ command:

perl -nle ‘/pattern/ and print’ file.txt


XLSperl -nle ‘/pattern/ and print’ file.xls

• See http://perl.jonallen.info/projects/xlstools

Excel on the Command Line perl.jonallen.info


XLSperl - options
• Supports the following command-line options:
-e ‘code’ = Perl code to execute (required!)
-n = Assumes an input loop iterating
over each cell (default)
-a = Changes input loop to iterate over
each row, splitting cells to @F and %F
-F = Set input record separator (when
processing text data)
-p = Prints $_ at the end of each loop
-l = Automatic line-end processing
-w = Enable warnings

Excel on the Command Line perl.jonallen.info


XLSperl - variables
• Adds new special variables:
$WS = Worksheet name
$ROW = Current row (1 .. x)
$COL = Current column name (A .. x)
$COLNUM = Current column number (1 .. x)
$CELL = Current cell (A1 .. ZZx)
@F = Array of cell values (in autosplit mode)
%F = Hash of cell values (in autosplit mode)

• @F and %F are tied (modifying $F[0] changes $F{A})


• $COL and $CELL are unavailable in autosplit mode

Excel on the Command Line perl.jonallen.info


XLSperl - creating Excel files
• Adds new command - XLSprint
– Write a single row:
XLSprint “Hello”,“World”;
– Write to multiple filehandles (but don’t print plain text to
the same files!):
XLSprint STDOUT @cells;
XLSprint STDERR @errors;

• XLSperl can also process standard text documents,


enabling easy conversion to Excel format

Excel on the Command Line perl.jonallen.info


XLSperl - examples
• Search multiple files and print matching cell
locations:
XLSperl -nle ‘/pattern/ and print
“$ARGV:$WS:$CELL - $_”’ *.xls

• Convert Excel file to CSV format:


XLSperl -lane ‘print join “,”,@F’ file.xls

• Extract specific column as text:


XLSperl -lane ‘print $F{B}’ file.xls

Excel on the Command Line perl.jonallen.info


XLSperl - more examples
• Convert UNIX password file to Excel format:
XLSperl -F: -lane ‘next if /^#/; XLSprint @F’
/etc/passwd >passwd.xls

• Concatenate multiple Excel files:


XLSperl -lane ‘XLSprint @F’ *.xls >new.xls

• Extract rows from an Excel file:


XLSperl -lane ‘XLSprint @F if($ROW <= 10)’
old.xls >new.xls

Excel on the Command Line perl.jonallen.info


Future enhancements
• Run as command interpreter (#!/usr/bin/XLSperl)
• Implement ‘-M’ option to load additional Perl
modules
• In-place editing of Excel files (use the ‘-i’ option),
that will keep formatting of original document

XLSperl -i.bak -pe ‘s/old/new/’ file.xls

• Any ideas?

Excel on the Command Line perl.jonallen.info


Fin!

Thank you for listening!

Source code is available from


http://perl.jonallen.info/projects/xlstools

Excel on the Command Line perl.jonallen.info

You might also like