my 1st post for the Guardian Australia

Thursday May 30, 2013

Filed under: Australian Politics,R,statistics — jackman @ 1:04 pm

I’ll be contributing a piece about once a week for the Guardian Australia, under a part of the web site we’re calling The Swing.

The graphs from my 1st effort were rendered in-line and at rather low res.

Bigger, full res versions appear below; click on the in-line versions.



It would be great to find a way to quickly make nice, web-friendly graphs out of R. Vega looks like a reasonable wrapper to d3. Datawrapper.de just doesn’t give me enough control over annotations, axes etc… I’m also looking at Rickshaw. Life is short, beautiful graphics are hard, sometimes…

Comments (8)

CRAN might get tenure at Yale?

Friday August 24, 2012

Filed under: R,statistics — jackman @ 7:00 am

From one of the R lists I follow:

Today (2012-08-23) on CRAN [1]:

“Currently, the CRAN package repository features 4001 available packages.”

These packages are maintained by approximately 2350 different folks [5].

Previous milestones:

2011-05-12: 3,000 packages [1]
2009-10-04: 2,000 packages [2]
2007-04-12: 1,000 packages [3]
2004-10-01: 500 packages [4]
2003-04-01: 250 packages [4]

[1] http://cran.r-project.org/web/packages/
[2] https://stat.ethz.ch/pipermail/r-devel/2009-October/055049.html
[3] https://stat.ethz.ch/pipermail/r-devel/2007-April/045359.html
[4] My private in-house data.
[5] http://cran.r-project.org/web/checks/check_summary_by_maintainer.html


PS. This count includes only packages on CRAN. There are more
packages elsewhere.

Comments Off

Rudd, the last one standing?: Federal implications of QLD state election results

Wednesday April 4, 2012

Filed under: Australian Politics,R,statistics — jackman @ 12:40 am

Labor won 15 of Queensland’s 29 House of Reps seats in the 2007 Federal election (AEC details here). Yet just three years later, in the 2010 Federal election, Labor won only 8 of 30 Queensland Reps seats, with 33.6% of 1st preferences (a swing of -9.3 percentage points).

Labor’s best performance on 1st preferences in 2010 was in Capricornia (46%), which translated into a 54-46 2PP result. Kevin Rudd won Griffith with 44% of 1st preferences, resulting in a 58-42 2PP result. Wayne Swan and the LNP candidate split the 1st preferences in Lilley, 41-41, with Swan winning the seat with Green preferences, 53-47 2PP. Labor managed to get home in Moreton in 2010, with 36% of the 1st preference vote, and a 51-49 2PP result.

The state election of some 10 days ago was conducted under different district boundaries (89 seats in the Queensland parliament) and a different electoral system (optional preferential). Moreover, Katter’s Australian Party ran candidates in 76 seats, winning 11.5% of 1st preferences, further complicating comparisons with previous elections (state or federal). In any event, Labor won about 26.7% of 1st preferences (ECQ results), down 6.9 percentage points from its performance in the 2010 Federal election, and down a staggering 15.6 percentage points from the 2009 state election.
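The swing arithmetic in the last few paragraphs is simple enough to check in a couple of lines. A minimal sketch follows; the function names are mine, and the shares are just the rounded figures quoted above:

```python
def swing(current_share, previous_share):
    """Swing in percentage points; negative means the party lost ground."""
    return round(current_share - previous_share, 1)


def implied_previous_share(current_share, swing_points):
    """Back out the earlier vote share from a current share and a swing."""
    return round(current_share - swing_points, 1)


# Figures quoted in the post (ALP 1st preferences, %).
alp_federal_2010 = 33.6
alp_state_2012 = 26.7

# State 2012 versus Federal 2010.
print(swing(alp_state_2012, alp_federal_2010))

# Shares implied by the quoted swings of -9.3 (Federal 2007 to 2010)
# and -15.6 (state 2009 to 2012).
print(implied_previous_share(alp_federal_2010, -9.3))
print(implied_previous_share(alp_state_2012, -15.6))
```

The implied earlier shares are only as precise as the rounded inputs.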

How might these 2012 state-level results translate into Federal results?

There are many different ways of looking at this, all of which involve some guesswork and a few assumptions, given the differences in the two electoral systems, the configuration of parties and so on.

Here’s a stab that I’ve been working on over the last week or so (“Spring Break” here at Stanford). The AEC conveniently (!) geo-codes its polling places and publishes that data on its web site. Shape files for Federal electorates are also available. This makes it feasible to start re-aggregating booth-level results from the state election up to Federal seats.
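To make the re-aggregation idea concrete, here is a minimal sketch under some loud assumptions: each booth is a point with hypothetical `lon`, `lat`, `alp` and `formal` fields, each Federal seat is a single simple polygon, and a booth's votes go wholly to the seat that contains it. Real shape files have multi-part polygons and would call for a proper GIS library, and this is not necessarily how the actual analysis was done.

```python
from collections import defaultdict


def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def reaggregate(booths, seats):
    """Sum booth-level votes by containing seat; return ALP share (%) per seat."""
    totals = defaultdict(lambda: {"alp": 0, "formal": 0})
    for booth in booths:
        for name, polygon in seats.items():
            if point_in_polygon(booth["lon"], booth["lat"], polygon):
                totals[name]["alp"] += booth["alp"]
                totals[name]["formal"] += booth["formal"]
                break  # assume each booth falls in at most one seat
    return {
        name: round(100 * t["alp"] / t["formal"], 1)
        for name, t in totals.items()
        if t["formal"] > 0
    }


# Toy data: two square "seats" and three made-up booths.
seats = {
    "Seat A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "Seat B": [(1, 0), (2, 0), (2, 1), (1, 1)],
}
booths = [
    {"lon": 0.5, "lat": 0.5, "alp": 300, "formal": 1000},
    {"lon": 0.2, "lat": 0.8, "alp": 500, "formal": 1000},
    {"lon": 1.5, "lat": 0.5, "alp": 200, "formal": 800},
]
shares = reaggregate(booths, seats)
print(shares)
```

Assigning every vote at a booth to the seat containing the booth is itself an assumption, since voters can cross boundaries to vote.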

A few steps and assumptions are required (and I’ll write this up at some point):

So what do you get when you do this re-aggregation, subject to all the caveats sounded above? Keep in mind I only have 1st preferences, at least for now.

The figure below (click for full-size) shows a scatterplot of imputed Federal results for the ALP given the 2012 state results, for each of Queensland’s 30 Federal seats, against the ALP’s actual 1st preference vote share (%) recorded in the 2010 Federal election. The diagonal line is a 45 degree line, a “no difference” line. On average, the data points lie below the diagonal, indicating what we know, that Labor did considerably better in the 2010 Federal election than in the 2012 state election.

Red dots and labels indicate the 8 seats won by Labor in 2010. The good news (!?) for Labor is that the Federal seats in which its primary vote utterly cratered are seats in which it had no chance of winning in the 1st place, where its 2010 1st preference vote share was below 30% or barely above 30% (e.g., Wide Bay, Maranoa, Fairfax, Wright, Fisher, Hinkler).

The bad news for Labor is that it would seem that most of its 8 Federal seats in Queensland are in some peril, with the exceptions perhaps being Griffith (Rudd’s seat), and maybe Rankin (Craig Emerson) and Oxley. The estimated ALP 1st preference vote share given the 2012 state results in these 3 seats lies above the actual ALP 1st preference recorded in Moreton in 2010, which was Labor’s weakest among the 8 seats it won in 2010 (and observe the many assumptions implied in that extrapolation).

Lilley, Swan’s seat, will be interesting. I grew up in Lilley on Brisbane’s northside. When Labor is really on the nose, it goes to the Coalition. Swan lost the seat in 1996 in his sophomore election, but has held it since 1998. I’m not sure the last redistribution helped, and it’s tough to see Labor win it if its primary vote share slips below 35%. Complicating factors are what role the Katter party might play, as well as some kind of “personal vote” for Swan (an incumbent Federal Treasurer, no less).

I also show the implied swings given by these estimates of ALP 1st preference vote share (bigger version available by clicking):

This presentation of the data highlights that Griffith (Rudd’s seat) has the smallest implied swing among Labor’s 8 seats, around about 5 percentage points. Coupled with the fact that Rudd starts off at a tolerable level of 1st preference support, this bolsters confidence that Griffith remains Labor’s best shot at a “retain” in 2013.

The implied swing in Moreton is only a little larger, but there is far less buffer there. Swings of -7 to -8 percentage points on 1st preferences in Lilley, Rankin and Oxley would have to be almost surely fatal to Labor’s chances there. And double digit swings in Petrie, Blair and Capricornia would also have to be beyond the margin of survival.

Could Rudd be the last (QLD, Labor) one standing?

Comments Off

NYTimes wants a “maker”

Tuesday January 3, 2012

Filed under: computing,statistics — jackman @ 3:20 pm

Via @flowingdata, the NYTimes is looking for a “maker”.

This job title and the description remind me that “making” cool deliverables is just like…

Which then might imply that this other NYTimes “data scientist” position is more like being an “Oompa Loompa”?

Comments (1)

Apple now speaks Australian?

Wednesday November 2, 2011

Filed under: Apple,computing — jackman @ 4:33 pm

Too funny. Apple just released iOS 5.0.1 beta to developers. One of the listed improvements and bug fixes is

improves voice recognition for Australian users using dictation

Comments (1)

pscl 1.04 live on CRAN

Saturday October 15, 2011

Filed under: computing,R,statistics — jackman @ 5:01 pm

Update to my pscl package, now on CRAN.

Biggest change: fixing a bug in the way MCMC draws for item parameters were being stored and summarized by ideal.

Comments Off

Bay Area R Users group has 1300 members

Wednesday October 12, 2011

Filed under: R,statistics — jackman @ 3:03 pm


You are not alone!

Comments Off

Brad Efron meets Clarify?

Monday October 3, 2011

Filed under: computing,statistics — jackman @ 11:36 am

This might be an interesting seminar. I wonder if Brad knows about the long history of “poor-person’s” (approximately-asymptotically valid) Bayesian inference in political science via things like Clarify?

Tuesday, October 4, 4:15pm: Statistics Seminar, Sequoia Hall Room 200
Brad Efron, Stanford University Statistics
“Bayesian inference and the parametric bootstrap”

The parametric bootstrap can often be used to compute posterior distributions obtained from complicated Bayesian models. Besides its computational advantages, the bootstrap offers insight on the relationship between Bayesian and frequentist methods. I will discuss some examples relating to exponential families, generalized linear models, and high-dimensional inference.

Comments Off

Lion Update Went Bad

Friday July 22, 2011

Filed under: computing — jackman @ 7:42 am

I’m on the road in Sydney (still) and tried the Lion Mac OS update as soon as it came out (downloaded the upgrade overnight here in the hotel).

Word to the wise: run disk utilities and “repair” your disk before you start the upgrade. Also, don’t do the upgrade on the road away from your boot media.

Here is what happened to me. The install appeared to go well. Then I got into something of an infinite loop. The machine wanted to restart to complete the install. But then it couldn’t find my hard drive. It seems my hard drive had some corruption in the directory structure (either extant or introduced by the upgrade process, I can’t be sure), such that the drive became inaccessible, other than the install partition created by the Lion install app. Lion couldn’t finish the install of itself, because it couldn’t see the drive, or reported the drive as “locked”. Disk Utilities (as visible from the installer startup) couldn’t repair the issue. Much rebooting in various modes (option key, C key, Command-V, Command-S) didn’t resolve the issue. HD no longer visible, or mountable only in read-only mode, certainly couldn’t find it and boot from it.

Hmm…. This behavior had been reported by some folks on various Apple discussion lists. I wound up going to a Mac vendor here in Sydney (Next Byte), buying Disk Warrior for an outrageous sum of money.
No way in hell would my laptop boot from the Disk Warrior CD — I suspect my SuperDrive is broken too (time for an external SuperDrive and a new MacBook Air, btw). Things are not looking good.

Found a Mac desktop at the University with a FireWire cable. Boot up laptop in target mode (T key at power up). Launch Disk Warrior on the desktop, point it at my laptop’s HD, repaired the directory structure etc. This got my Mac back to a state where it could see the drive, although now I have an “unjournaled” HFS+ drive. This caused the Lion installer to crap out, but at least I can boot in Snow Leopard, from whence I type now. Phew.

No data lost. And I run an over-the-air backup service via Stanford, so everything was safe in any event.

But it was a crappy two days or so. I’m home on Saturday, where I will run a full backup etc, create a bootable image on a DVD, and try the upgrade again.

Comments Off

tracking Australian election betting markets again (now with sparklines)

Wednesday July 13, 2011

Filed under: Australian Politics,R — jackman @ 3:23 pm

The header of my blog (above) shows the latest prices on offer in some of Australia’s election betting markets.  I convert the prices to an implied probability of ALP win (factoring out the bookie’s profit margin, the so-called “overround”).
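For the curious, here is a minimal sketch of one common way to factor out the overround, assuming decimal odds and a simple proportional normalization (not necessarily the exact adjustment used for the header). The prices below are made up for illustration.

```python
def implied_probabilities(decimal_odds):
    """Convert decimal odds to probabilities that sum to 1.

    Returns (probabilities, overround), where overround is the amount by
    which the raw inverse prices exceed 1 (the bookie's margin).
    """
    raw = {outcome: 1.0 / price for outcome, price in decimal_odds.items()}
    book_sum = sum(raw.values())
    probs = {outcome: r / book_sum for outcome, r in raw.items()}
    return probs, book_sum - 1.0


# Hypothetical prices: $4.00 for an ALP win, $1.20 for a Coalition win.
prices = {"ALP": 4.00, "Coalition": 1.20}
probs, overround = implied_probabilities(prices)
print(round(probs["ALP"], 3), round(overround, 3))
```

Proportional normalization spreads the margin evenly across outcomes; other adjustments exist and would give slightly different numbers for long shots.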

I’m using some Javascript by John Resig to make Tufte-ish sparklines, although the Google version of sparklines looks easy to work with too. I’m using some R to generate PNG files plotting the last 72 hours of data.

Time-series graphs appear as PDFs too, again see the header of the blog.

On the data themselves, the betting markets have been moving in a pro-Coalition direction over the last two weeks, with some movement around the release of recent polls showing that the Coalition would romp home. I think we’re still waiting on some post-carbon-tax polling, and how the betting markets digest that.

Comments Off