Goodbye static graphs, hello shiny, ggvis, rmarkdown (vs JS solutions)

Monday August 18, 2014

Filed under: computing,R,statistics,type — jackman @ 6:00 am

One of the very exciting and promising developments from RStudio is the rmarkdown/shiny/ggvis combination of tools.

We’re on the verge of static graphs and presentations being as old-fashioned as overhead transparencies.

I’ve spent the last couple of days giving these tools a test spin. Lots of comments and links to examples appear below.

I came to this investigation with a specific question in mind: how can I get a good-looking scatterplot with some rollover/tooltip functionality into a presentation, with one tool or one workflow?

Soft constraints: I’d prefer to use R, at least on the data side, and I would also like control over look and feel (e.g., slide transitions) and over stylistic elements like type, color, sizes and spacing.

I use either Beamer or Keynote for presentations (Beamer for teaching/stats-type talks, Keynote for more substantive, general audience talks). I began by investigating how one might drop a d3-rendered graph into a Keynote presentation, but this seems pretty hard. Hacking at the files produced by Keynote’s export-to-HTML function seems formidable.

I’ve also been poking at solutions that are all on the JS side of the ledger (e.g., d3 + stack), inspired by this example from Karl Broman. I’m also interested in how one might roll an interactive graphic into Prezi.

But back to the RStudio workflow, using the rmarkdown/shiny/ggvis combination.  Here is some sample output I’ve created: a standalone scatterplot and a dummy presentation.

Some observations:

If you’re happy with the out-of-the-box style defaults, then this stack of tools is just about there and evolving rapidly. And keep in mind that rmarkdown does a lot more than make presentations. For instance, I’m yet to really explore rmarkdown for producing publish-to-web papers.

If you crave fine control over layout and graphical elements, then I think it might still be a d3/js world, at least for a while longer.

I’m still left thinking that if I could drop shiny apps or d3 into Keynote (somehow), then I’d have the best of both worlds.


ideal point graphics, via d3

Wednesday July 30, 2014

Filed under: R,statistics — jackman @ 1:04 pm

I’ve updated some of the graphical displays of the ideal point estimates I serve up here. I’ve rendered some of these in d3, with some rollover lah-de-dah: (1) 113th House ideal points in a long “caterpillar” format; (2) scatterplot of ideal point against Obama 2012 vote in district. Screenshot of the scatterplot appears below.

My R scripts dump CSV files containing the ideal point estimates, credible intervals and labeling info, which I then pick up on the d3 side. Separate files hold the fitted values from a local regression of estimated ideal point on Obama vote in district.

I toyed with the idea of loess on the d3/js side (with sliders for user control of bandwidth etc), more as a plausibility probe than anything, but it seems like a lot to push down through the browser.
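To give a sense of what "loess in the browser" would have to do at every point on the curve, here is a bare-bones sketch of the core step: a tricube-weighted straight-line fit to the nearest points. It is written in Python purely for illustration (not the R or JS actually used for these graphs), the function names are made up, and it leaves out the robustness iterations and other refinements of R's `loess()`. The `span` argument is the bandwidth a slider would control.

```python
def tricube(u):
    """Tricube kernel: (1 - |u|^3)^3 inside the window, 0 outside."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def local_linear(x, y, x0, span=0.75):
    """Fitted value at x0 from a tricube-weighted straight-line fit
    to the span * n points nearest x0 (no robustness iterations)."""
    n = len(x)
    k = min(n, max(2, int(round(span * n))))
    # Bandwidth: distance to the k-th nearest point, nudged up so the
    # k-th neighbour keeps a tiny positive weight and h is never zero.
    h = max(sorted(abs(xi - x0) for xi in x)[k - 1], 1e-12) * (1 + 1e-9)

    # Weighted sums with x centred at x0, so the intercept is the fit.
    sw = swd = swdd = swy = swdy = 0.0
    for xi, yi in zip(x, y):
        d = xi - x0
        w = tricube(d / h)
        sw += w
        swd += w * d
        swdd += w * d * d
        swy += w * yi
        swdy += w * d * yi

    denom = sw * swdd - swd * swd
    if denom <= 1e-12 * sw * swdd:
        # Degenerate window (all weight on one x value): weighted mean.
        return swy / sw
    return (swdd * swy - swd * swdy) / denom


# Toy data (not the ideal point estimates): an exact straight line,
# which a local linear fit should reproduce.
xs = [float(i) for i in range(20)]
ys = [2.0 * xi + 1.0 for xi in xs]
fit = local_linear(xs, ys, 7.3, span=0.5)
```

Every fitted value needs a sort and a pass over the data, and the whole curve has to be recomputed each time the bandwidth changes, which is roughly why pushing this through the browser looks like a lot of work.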

[Screenshot of the scatterplot: ideal point against Obama 2012 vote in district]


my 1st post for the Guardian Australia

Thursday May 30, 2013

Filed under: Australian Politics,R,statistics — jackman @ 1:04 pm

I’ll be contributing a piece about once a week for the Guardian Australia, under a part of the web site we’re calling The Swing.

The set of graphs from my 1st effort were rendered in-line and rather low-res.

Bigger, full res versions appear below; click on the in-line versions.



It would be great to find a way to quickly make nice, web-friendly graphs out of R. Vega looks like a reasonable wrapper to d3. Datawrapper.de just doesn’t give me enough control over annotations, axes etc… I’m also looking at Rickshaw. Life is short, beautiful graphics are hard, sometimes…


CRAN might get tenure at Yale?

Friday August 24, 2012

Filed under: R,statistics — jackman @ 7:00 am

From one of the R lists I follow:

Today (2012-08-23) on CRAN [1]:

“Currently, the CRAN package repository features 4001 available packages.”

These packages are maintained by approximately 2350 different folks.

Previous milestones:

2011-05-12: 3,000 packages [1]
2009-10-04: 2,000 packages [2]
2007-04-12: 1,000 packages [3]
2004-10-01: 500 packages [4]
2003-04-01: 250 packages [4]

[1] http://cran.r-project.org/web/packages/
[2] https://stat.ethz.ch/pipermail/r-devel/2009-October/055049.html
[3] https://stat.ethz.ch/pipermail/r-devel/2007-April/045359.html
[4] My private in-house data.
[5] http://cran.r-project.org/web/checks/check_summary_by_maintainer.html


PS. This count includes only packages on CRAN. There are more
packages elsewhere.


Rudd, the last one standing?: Federal implications of QLD state election results

Wednesday April 4, 2012

Filed under: Australian Politics,R,statistics — jackman @ 12:40 am

Labor won 15 of Queensland’s 29 House of Reps seats in the 2007 Federal election (AEC details here). Yet just three years later, in the 2010 Federal election, Labor won only 8 of 30 Queensland Reps seats, with 33.6% of 1st preferences (a swing of -9.3 percentage points).

Labor’s best performance on 1st preferences in 2010 was in Capricornia (46%), which translated into a 54-46 2PP result. Kevin Rudd won Griffith with 44% of 1st preferences, resulting in a 58-42 2PP result. Wayne Swan and the LNP candidate split the 1st preferences in Lilley, 41-41, with Swan winning the seat with Green preferences, 53-47 2PP. Labor managed to get home in Moreton in 2010, with 36% of the 1st preference vote, and a 51-49 2PP result.

The state election of some 10 days ago was conducted under different district boundaries (89 seats in the Queensland parliament) and a different electoral system (optional preferential). Moreover, Katter’s Australian Party ran candidates in 76 seats, winning 11.5% of 1st preferences, further complicating comparisons with previous elections (state or federal). In any event, Labor won about 26.7% of 1st preferences (ECQ results), down 6.9 percentage points from its performance in the 2010 Federal election, and down a staggering 15.6 percentage points from the 2009 state election.
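The vote shares and swings quoted so far can be checked against each other with a few lines of arithmetic. The sketch below uses only the percentages given in the text; the variable names are made up, and Python is used purely for illustration.

```python
# All figures are percentages of 1st preferences quoted in the post.
alp_fed_2010 = 33.6      # ALP share in Queensland, 2010 Federal election
swing_fed_2010 = -9.3    # swing from 2007 to 2010, in percentage points
alp_state_2012 = 26.7    # ALP share, 2012 state election
drop_since_2009 = 15.6   # fall since the 2009 state election

# Implied 2007 Federal share: undo the 2010 swing.
alp_fed_2007 = round(alp_fed_2010 - swing_fed_2010, 1)

# Gap between the 2012 state result and the 2010 Federal result.
state_vs_fed = round(alp_state_2012 - alp_fed_2010, 1)

# Implied 2009 state share: add back the fall since 2009.
alp_state_2009 = round(alp_state_2012 + drop_since_2009, 1)

print(alp_fed_2007, state_vs_fed, alp_state_2009)
```

So the quoted figures imply Labor polled about 42.9% in Queensland at the 2007 Federal election and about 42.3% at the 2009 state election, and the 6.9 point gap between the 2012 state and 2010 Federal results is consistent with the two shares given.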

How might these 2012 state-level results translate into Federal results?

There are many different ways of looking at this, all of which involve some guesswork and assumptions, given the differences in the two electoral systems, the configuration of parties and so on.

Here’s a stab that I’ve been working on over the last week or so (“Spring Break” here at Stanford). The AEC conveniently (!) geo-codes its polling places and publishes that data on its web site. Shape files for Federal electorates are also available. This makes it feasible to start re-aggregating booth-level results from the state election up to Federal seats.
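The core of the re-aggregation idea is a point-in-polygon lookup followed by a sum. What follows is only a toy sketch of that idea, not the actual procedure used for the figures in this post: it is in Python for illustration, the function names, seat names and booth numbers are all made up, longitude and latitude are treated as flat coordinates, and holes, multi-part electorates and booths sitting exactly on a boundary are ignored. Real shape files would call for a proper GIS library.

```python
from collections import defaultdict


def point_in_polygon(lon, lat, polygon):
    """Ray casting: count boundary crossings of a ray heading east."""
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        # Does this edge straddle the booth's latitude?
        if (yi > lat) != (yj > lat):
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside
        j = i
    return inside


def reaggregate(booths, seats):
    """Sum booth-level votes within each seat; return ALP share (%)."""
    totals = defaultdict(lambda: [0, 0])  # seat -> [ALP votes, all votes]
    for b in booths:
        for name, polygon in seats.items():
            if point_in_polygon(b["lon"], b["lat"], polygon):
                totals[name][0] += b["alp"]
                totals[name][1] += b["total"]
                break  # each booth is assigned to at most one seat
    return {name: 100.0 * alp / total
            for name, (alp, total) in totals.items() if total > 0}


# Hypothetical data: two square "seats" and three booths.
seats = {
    "Seat A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "Seat B": [(1, 0), (2, 0), (2, 1), (1, 1)],
}
booths = [
    {"lon": 0.5, "lat": 0.5, "alp": 300, "total": 1000},
    {"lon": 0.2, "lat": 0.8, "alp": 100, "total": 500},
    {"lon": 1.5, "lat": 0.5, "alp": 200, "total": 400},
]
shares = reaggregate(booths, seats)
```

Votes are summed before the percentage is taken, so big booths count for more than small ones, as they should.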

A few steps and assumptions are required (and I’ll write this up at some point):

So what do you get when you do this re-aggregation, subject to all the caveats sounded above? Keep in mind I only have 1st preferences, at least for now.

The figure below (click for full-size) shows a scatterplot of imputed Federal results for the ALP given the 2012 state results, for each of Queensland’s 30 Federal seats, against the ALP’s actual 1st preference vote share (%) recorded in the 2010 Federal election. The diagonal line is a 45 degree line, a “no difference” line. On average, the data points lie below the diagonal, indicating what we know, that Labor did considerably better in the 2010 Federal election than in the 2012 state election.

Red dots and labels indicate the 8 seats won by Labor in 2010. The good news (!?) for Labor is that the Federal seats in which its primary vote utterly cratered are seats in which it had no chance of winning in the 1st place, where its 2010 1st preference vote share was below 30% or barely above 30% (e.g., Wide Bay, Maranoa, Fairfax, Wright, Fisher, Hinkler).

The bad news for Labor is that most of its 8 Federal seats in Queensland would seem to be in some peril, with the exceptions perhaps being Griffith (Rudd’s seat), and maybe Rankin (Craig Emerson) and Oxley. The estimated ALP 1st preference vote share given the 2012 state results in these 3 seats lies above the actual ALP 1st preference recorded in Moreton in 2010, which was Labor’s weakest among the 8 seats it won in 2010 (and observe the many assumptions implied in that extrapolation).

Lilley (Swan’s seat) will be interesting. I grew up in Lilley on Brisbane’s northside. When Labor is really on the nose, it goes to the Coalition. Swan lost the seat in 1996 in his sophomore election, but has held it since 1998. I’m not sure the last redistribution helped, and it’s tough to see Labor win it if its primary vote share slips below 35%. Complicating factors are what role the Katter party might play, as well as some kind of “personal vote” for Swan (an incumbent Federal Treasurer, no less).

I also show the implied swings given by these estimates of ALP 1st preference vote share (bigger version available by clicking):

This presentation of the data highlights that Griffith (Rudd’s seat) has the smallest implied swing among Labor’s 8 seats, around about 5 percentage points. Coupled with the fact that Rudd starts off at a tolerable level of 1st preference support, this bolsters confidence that Griffith remains Labor’s best shot at a “retain” in 2013.

The implied swing in Moreton is only a little larger, but there is far less buffer there. Swings of -7 to -8 percentage points on 1st preferences in Lilley, Rankin and Oxley would almost surely be fatal to Labor’s chances there. And double digit swings in Petrie, Blair and Capricornia would also have to be beyond the margin of survival.

Could Rudd be the last (QLD, Labor) one standing?


NYTimes wants a “maker”

Tuesday January 3, 2012

Filed under: computing,statistics — jackman @ 3:20 pm

Via @flowingdata, the NYTimes is looking for a “maker”.

This job title and the description reminds me that “making” cool deliverables is just like…

Which then might imply that this other NYTimes “data scientist” position is more like being an “Oompa Loompa”?


Apple now speaks Australian?

Wednesday November 2, 2011

Filed under: Apple,computing — jackman @ 4:33 pm

Too funny. Apple just released iOS 5.0.1 beta to developers. One of the listed improvements and bug fixes is

improves voice recognition for Australian users using dictation


pscl 1.04 live on CRAN

Saturday October 15, 2011

Filed under: computing,R,statistics — jackman @ 5:01 pm

Update to my pscl package, now on CRAN.

Biggest change: fixing a bug in the way MCMC draws for item parameters were being stored and summarized by ideal.


Bay Area R Users group has 1300 members

Wednesday October 12, 2011

Filed under: R,statistics — jackman @ 3:03 pm


You are not alone!


Brad Efron meets Clarify?

Monday October 3, 2011

Filed under: computing,statistics — jackman @ 11:36 am

This might be an interesting seminar. I wonder if Brad knows about the long history of “poor-person’s” (approximately-asymptotically valid) Bayesian inference in political science via things like Clarify?

Tuesday, October 4, 4:15pm: Statistics Seminar, Sequoia Hall Room 200
Brad Efron, Stanford University Statistics
“Bayesian inference and the parametric bootstrap”

The parametric bootstrap can often be used to compute posterior distributions obtained from complicated Bayesian models. Besides its computational advantages, the bootstrap offers insight on the relationship between Bayesian and frequentist methods. I will discuss some examples relating to exponential families, generalized linear models, and high-dimensional inference.
