Jesse Whitehouse Developer | Analyst

13 vs 15 inch Laptop

Some things are harder to buy than others, especially for a compulsive researcher like me. I pore over reviews for hours because I’m convinced the perfect choice is out there and I can find it. It’s usually awesome and I feel very proud when that happens. But when it doesn’t I look kinda foolish. Like when I bought and returned four laptops this summer before settling on my final pick: a fifteen inch MacBook Pro with Touch Bar.

My Needs

This really wasn’t a complicated subject for me. My iMac breezes through the heavy computational lifting. My iPhone runs YouTube, Safari, Twitter and Mail. But I wanted a development machine for the road (and let’s be honest, the couch). It needs to run Python 3, Sublime Text, and a couple of Linux virtual machines. Any Mac in the lineup can do this. So I just had to pick between the twelve, thirteen, and fifteen inch models.

The Road to Fifteen

The twelve and thirteen inch models share similar proportions. I ruled out the twelve inch model immediately because it lacked screen real estate. I still tried the thirteen inch for a couple of weeks, though, because its higher DPI screen displays more windows at once. I went back and forth between the fifteen and thirteen inch models for a month before I eventually returned the thirteen.

In the end, it came down to comfort. Developing on the thirteen inch model was like removing a wet swimsuit in a telephone booth. I develop with Sublime on one half of the screen and Safari on the other. With Safari zoomed to readable size, however, I couldn’t show a full page-width in Safari. I tried zooming out and in, or resizing my Sublime window. But it never felt right.

There are a lot of discussions about the thirteen versus the fifteen inch computers across the web. Here are my two cents: performance aside, if you work with multiple windows at the same time, buy the fifteen inch model.

Upgrading BasicTeX with Recommended Fonts

Setup

The other day I needed a way to convert a markdown document to PDF for my company. All signs pointed to Pandoc as the fastest and highest quality option. So I downloaded it and ran:

$whitehouse: pandoc training.md -o training.pdf

But immediately received an error:

pandoc: pdflatex not found. pdflatex is needed for pdf output.
Error: pandoc document conversion failed

Pandoc requires a LaTeX installation to generate PDFs. For macOS, this usually means the enormous MacTeX package. As of May 2017 it’s a 3GB download.1 The entire Microsoft Office Suite download is only 2.7GB, by comparison. MacTeX includes every major package needed to generate even the most complicated typesetting and page layouts. So it’s really overkill for my simple needs.

I found an alternative Mac installation called BasicTeX that’s only 72MB. It includes all the basic packages but nothing else. It does include tlmgr (the TeX Live package manager), though, so you can add components later.

brew cask install basictex

During installation BasicTeX adds /usr/local/texlive/current_version to your $PATH variable and conveniently stores all of its resources there. Close and reopen terminal.app and Pandoc should generate a beautiful PDF with aplomb.

The default settings look beautiful. It uses Latin Modern font with suitable line spacing. The margins are a touch bloated. But the whole setup will do in a pinch.

Add Fonts

Still, I don’t want my training documentation at work to look like the Journal of the American Medical Association. So I needed a different font. If I had installed MacTeX this wouldn’t be a problem because the mac daddy of downloads includes every font under the sun. But BasicTeX includes just two: Latin Modern and Times.

According to the Pandoc Documentation, the fastest way to add some useful fonts is with tlmgr. First, make sure it’s updated.

sudo tlmgr update --self

Then have tlmgr install the collection-fontsrecommended package:

sudo tlmgr install collection-fontsrecommended

This package includes the thirty-five base PostScript 2 fonts (more info here), the TeX Gyre font families, and encoding support for the Computer Modern families. Detailed information about this package can be found at netbsd.

To use a different font, add YAML front matter to the top of your .md file before running Pandoc.

---
title: Your Title
author: Your Name
fontfamily: Your Font Choice
---

I chose Palatino.2
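If you convert documents regularly, the front matter step can be scripted. The following is only a sketch under a few assumptions: the helper names add_front_matter and build_pdf are hypothetical, palatino is assumed to be the LaTeX package name that fontfamily should carry, and pandoc plus a LaTeX engine are assumed to be on your PATH already.

```python
import subprocess


def add_front_matter(body, title, author, fontfamily):
    """Return the markdown body with a YAML front matter block on top."""
    header = "\n".join([
        "---",
        f"title: {title}",
        f"author: {author}",
        f"fontfamily: {fontfamily}",
        "---",
        "",
    ])
    # One blank line separates the front matter from the document body.
    return header + "\n" + body


def build_pdf(markdown_path, pdf_path):
    """Run the same pandoc command used earlier in this post."""
    subprocess.run(["pandoc", str(markdown_path), "-o", str(pdf_path)], check=True)


styled = add_front_matter("# Training\n", "Your Title", "Your Name", "palatino")
print(styled)
```

Write the returned text to a new .md file and pass that file to build_pdf. The check=True argument makes a failed conversion raise an exception instead of passing silently.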

  1. And it’s getting bigger. In my search for solutions to this problem I found references to MacTeX from 2012 and earlier when “large” was 1.5GB. When you read this it will probably have grown some more. 

  2. You can see a Pandoc rendered version of this blog post here 

Understanding iter_rows in openpyxl

Let me say first that I love openpyxl. The more I work with it, the more I admire its power and ruthless elegance. But occasionally the documentation leaves me wanting.

Reading data with iter_rows

Let’s say I have a file called example.xlsx with some simple data, shown here.

     A        B       C      D
1    First    Second  Third
2    Fourth   Fifth   Sixth
3    Seventh  Eight   Ninth
4

As with any two-dimensional array, the most obvious way to store this data in Python is a nested list. And thankfully, openpyxl makes this easy. To start, import the library and open the file.

from openpyxl import load_workbook

wb = load_workbook('_path_to_file')
ws = wb['sheet1']

We can access individual cell values easily with A1 notation like so.

>>> ws["A1"].value
'First'
>>> ws["C3"].value
'Ninth'

For this specific problem, we could make a nested list by manually typing each cell reference as a list element assignment.

myRange = [[None] * 3 for _ in range(3)]  # pre-size a 3 x 3 grid so index assignment works
myRange[0][0] = ws["A1"].value
myRange[0][1] = ws["B1"].value
myRange[0][2] = ws["C1"].value

...

But this method is time-consuming and inflexible. If we want to reuse this code later we’d need to type width * height cell references by hand. Instead we’ll use the iter_rows method. It looks like this:

1
2
3
4
5
6
myRange = []
for row in ws.iter_rows(min_row=1, max_row=3, min_col=1, max_col=3):
    rowList = []
    for cell in row:
        rowList.append(cell.value)
    myRange.append(rowList)

In lines 1 and 3 we declare our target lists for cell values.

Line 2 begins a for loop that loops through the output of the iter_rows method. iter_rows returns a generator that yields one tuple per row, and each tuple holds that row’s cell objects. We select our desired cells using the max and min arguments. Think of these like (1,1) notation from Excel VBA or xlwings.

  • min_col = 1 translates to column A
  • max_col = 3 translates to column C

Line 4 begins a for loop to cycle through each cell object. We append each cell value to the row list in line 5. Once the row has been fully iterated we append it to the parent list in line 6.

myRange now contains a nested list of the values in the range A1:C3.

>>> myRange
[['First', 'Second', 'Third'],
['Fourth', 'Fifth', 'Sixth'],
['Seventh', 'Eight', 'Ninth']]

If we need to transpose the data (swap rows and columns) we could use the iter_cols method instead.

>>> myRange_columns
[['First', 'Fourth', 'Seventh'],
['Second', 'Fifth', 'Eight'],
['Third', 'Sixth', 'Ninth']]
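The transposed list above comes from the same loop with iter_cols swapped in. Here is a minimal sketch. To stay self-contained it builds a stand-in for example.xlsx in memory with Workbook() instead of loading the file, which is an assumption of the example and not part of the original workflow.

```python
from openpyxl import Workbook

# Stand-in for example.xlsx, built in memory with the same A1:C3 data.
wb = Workbook()
ws = wb.active
ws.append(["First", "Second", "Third"])
ws.append(["Fourth", "Fifth", "Sixth"])
ws.append(["Seventh", "Eight", "Ninth"])

# iter_cols walks the range column by column instead of row by row.
myRange_columns = []
for col in ws.iter_cols(min_row=1, max_row=3, min_col=1, max_col=3):
    colList = []
    for cell in col:
        colList.append(cell.value)
    myRange_columns.append(colList)

print(myRange_columns)
```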

Arguments optional

Neither method requires arguments. The default min value is 1: if no min is provided, the method begins looking for cell objects at A1. And neither method requires a max keyword. If no max is provided, the method continues through the last row and column that hold data (the worksheet’s max_row and max_column), so blank cells inside that area come back with a value of None. In the code sample above we can remove the arguments to iter_rows and would still receive the same results. Only use the max and min arguments if you must specifically exclude cells adjacent to your target range.
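A quick way to see what a call with no arguments covers is to leave a gap in the data. As far as I can tell from the openpyxl documentation, omitted bounds fall back to the sheet’s used dimensions (ws.max_row and ws.max_column). This sketch uses an in-memory workbook with a deliberately empty second row, which is my own test data and not example.xlsx.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.append(["First", "Second", "Third"])
ws["A3"] = "Seventh"  # row 2 is deliberately left empty

# No min or max arguments: the call spans A1 through the last used row and column.
rows = [[cell.value for cell in row] for row in ws.iter_rows()]

print(ws.max_row, ws.max_column)
print(rows)
```

The empty row does not stop the iteration. It shows up as a row of None values, so filter those out yourself if the sheet may contain gaps.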

Reading ranges with A1 notation

If A1 notation makes sense for your application, keep in mind that iter_rows and iter_cols underpin the A1 notation system as well. Whether your call to ws["range"] returns a column or row depends on the kind of range you call.

Range Type      Example  Method     Return Type
Entire Column   A:A      iter_cols
Entire Row      1:1      iter_rows
Bounded Row     A1:C3    iter_rows
Bounded Column  A1:A3    iter_cols

Always remember that A1 notation returns tuples of cell objects. And if there are multiple rows or columns selected, the tuple will be nested.
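The nesting is easiest to see by inspecting what comes back. A small sketch, again using an in-memory stand-in for example.xlsx (an assumption made so the snippet runs on its own):

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
for values in (["First", "Second", "Third"],
               ["Fourth", "Fifth", "Sixth"],
               ["Seventh", "Eight", "Ninth"]):
    ws.append(values)

block = ws["A1:C3"]   # several rows: a tuple of row tuples
column = ws["A:A"]    # one whole column: a flat tuple of cells

print(len(block), len(block[0]))
print(block[1][2].value)                 # second row, third cell
print([cell.value for cell in column])
```

Because these are real tuples and not generators, they can be indexed and sliced with square brackets.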

Writing data with iter_rows

Let’s say we need to write our nested list back to example.xlsx after having converted each value from a spelled-out ordinal to its abbreviated numeric form. My list now looks like this:

[['1st', '2nd', '3rd'],
['4th', '5th', '6th'],
['7th', '8th', '9th']]

To write this back to my Excel file in place, I’ll use the following.

1
2
3
4
5
6
7
8
9
10
11
myRange = [['1st', '2nd', '3rd'],
          ['4th', '5th', '6th'],
          ['7th', '8th', '9th']]

targetRange = ws.iter_rows()

for tRange, mRange in zip(targetRange, myRange):
    for cell, val in zip(tRange, mRange):
        cell.value = val

wb.save('example_update.xlsx')

In lines 1 - 3 I declare my nested list of values to be written. In a real application, this would probably be generated programmatically elsewhere.

In line 5 I set targetRange equal to the generator that iter_rows returns, which yields one tuple of cell objects per worksheet row. As before, I don’t need arguments for iter_rows because I do not need to exclude any cells from my file write.

Line 7 says “Set my nested list and nested tuple beside each other (zip). Call the target range tRange and the source range mRange.”1

Line 8 says “One by one, take a cell object from the target range and set its contents equal to the value of the adjacent value in mRange (zip again).”

In line 11 I save my workbook as a new file called example_update.xlsx.
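The double zip can be tried without a spreadsheet at all. In this sketch StandInCell is a hypothetical class that mimics the one thing the loop needs from an openpyxl cell, a writable value attribute.

```python
class StandInCell:
    """Hypothetical stand-in for an openpyxl cell: it only holds a value."""

    def __init__(self, value=None):
        self.value = value


# A 2 x 3 "target range" of empty cells and a matching nested list of values.
targetRange = [[StandInCell() for _ in range(3)] for _ in range(2)]
myRange = [['1st', '2nd', '3rd'],
           ['4th', '5th', '6th']]

# The outer zip pairs each row of cells with the row of values beside it.
for tRange, mRange in zip(targetRange, myRange):
    # The inner zip pairs each cell with the value in the same position.
    for cell, val in zip(tRange, mRange):
        cell.value = val

written = [[cell.value for cell in row] for row in targetRange]
print(written)
```

Keep in mind that zip stops at the shorter of its inputs, so surplus cells or surplus values are skipped without any warning.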

This approach is a lot different from how VBA or libraries like xlwings treat ranges. Those tools allow you to reference cells with (1,1) notation. openpyxl doesn’t let you do this directly. A call to iter_rows returns a generator that makes tuples, instead of a tuple directly. As a result, you can’t slice it with square brackets unless you do some extra hacking. I’ve experimented with a few models but have yet to find one that works as efficiently with as few lines of code.
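One form that extra step could take is to materialize the generator into a tuple once and index that. This is a sketch and not a recommendation, since it holds every cell object in memory at the same time. openpyxl also provides ws.cell(row=..., column=...), which takes one-based row and column numbers. The workbook here is an in-memory stand-in.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.append(["First", "Second", "Third"])
ws.append(["Fourth", "Fifth", "Sixth"])

# Materialize the generator once so it can be indexed like a grid.
grid = tuple(ws.iter_rows(min_row=1, max_row=2, min_col=1, max_col=3))
top_left = grid[0][0].value       # zero-based: row 1, column A
bottom_right = grid[1][2].value   # zero-based: row 2, column C

# ws.cell uses one-based row and column numbers.
same_cell = ws.cell(row=2, column=3).value

print(top_left, bottom_right, same_cell)
```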

Despite the added mental gymnastics needed to grok how iter_rows works, it is a superior function. It economizes on memory use and it’s blazing fast. I wrote a process in xlwings a few weeks ago that transforms data from one format to another. When I rewrote it in openpyxl the script ran roughly 25 times faster.

  1. “When I learned this process I found it best to translate these nested for loops into plain English. So I’ll do that here.” 

Import Methods in Python

What’s the difference between import, import from, and import * in Python1?

import module

Assume that function do_something() is defined in myModule. You import myModule into main.py with the statement: import myModule. To call the function you would use the expression myModule.do_something().

You can use the as modifier in your import statement to enhance readability. For example, import myModule as mod would let you call the function with mod.do_something().
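Since myModule is hypothetical, here is the same pattern with the standard library’s math module standing in for it and sqrt standing in for do_something().

```python
# math stands in for the hypothetical myModule.
import math
import math as m

print(math.sqrt(16))  # call through the full module name
print(m.sqrt(16))     # same function through the shorter alias
print(m is math)      # both names refer to one module object
```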

from module import

Assume that function do_something() is defined in myModule. You can import just one function with the statement from myModule import do_something. To call the function, you do not need to prefix the function call with a module name. Instead, you call the function normally with the expression do_something().

You can use the as modifier here too. from myModule import do_something as do lets you call do_something(_args_) with the expression do(_args_).
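A runnable version of both forms, once more with math as the stand-in module. Note that the import statement names the function without parentheses. Parentheses appear only when the function is called.

```python
# math stands in for the hypothetical myModule; sqrt for do_something.
from math import sqrt
from math import sqrt as root

print(sqrt(16))      # no module prefix needed
print(root(16))      # same function under the alias
print(root is sqrt)  # the alias is just a second name for one function
```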

import *

Star import behaves like import from. Functions imported this way do not require a prefix when called. But this method imports every public name in the selected module (by default, every name that doesn’t begin with an underscore). If myModule contains the functions do_something(), add_something(), and make_something() you can call these functions without a prefix.

Star imports are convenient if you interact with Python in the shell. It’s cumbersome to import specific functions with import from or to type the module prefix of each function with import module. However, you should avoid star imports in your own modules. A complicated algorithm becomes difficult to debug and maintain if functions like do_something() are defined elsewhere in the file structure. This becomes more important if you import many modules.

For example, if the top of your module looks like this:

from module1 import *
from module2 import *
from module3 import *
from module4 import *

and you call do_something() in the script body, you have to search through the imported modules to find the function definition. Star imports are hostile to the person reading your code, whether that’s a team member on your project or you in six months.
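The problem goes beyond readability: when two star-imported modules share a name, the later import silently wins. The standard library offers a real example, since math and cmath both define sqrt. They stand in here for the hypothetical module1 through module4.

```python
from math import *
from cmath import *  # also defines sqrt, and silently replaces math's version

result = sqrt(4)
print(result)        # (2+0j)
print(type(result))  # <class 'complex'>
```

Nothing in the call sqrt(4) tells the reader which module it came from, and swapping the order of the two import lines changes the answer to a float.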

  1. See this useful response on Stack Overflow or the Python Docs for more information. 

Faster

I first discovered podcasts last year when, on a whim, I started listening to The Talk Show with John Gruber. It’s a funny, informative and very casual conversation about technology that comes out semi-weekly - and it’s still one of my favorites. A few weeks later, I started listening to ATP and eventually graduated to staples like This American Life, RadioLab, and of course Serial. I even started making my own podcast, called “Soul What?” – which is still in production but undergoing some substantial changes.

As much as I listened last year, though, I desperately wanted to hear more. So I made an audacious goal this year to consume a thousand hours of podcast material. It’s almost April and I’m well on my way, just shy of 200 hours. And although I’ve listened to hundreds of episodes I haven’t spent exactly 200 hours of my time with headphones in.

Instead, I use the 1.5x speed setting in my podcast player (Overcast) to jump through those episodes a little quicker. This has literally saved me hours of time. And in case you’re dubious, here’s a sample:



That’s audio sped up to 1.5x. It’s just as intelligible as normal speed to my ear, but only 23 seconds where the original is 32.

Here’s the original for comparison:



It did take some time for my ears to grow totally comfortable with the faster pace, and this tool sounds lousy if your podcasts include any background music at all. But if you enjoy simple, spoken-word podcasts like me, I highly recommend it.