iNZight and Sample Surveys
Chapter 1 Introduction
iNZight is a free software package based on the R language, but you don’t need to know any R to use it. It was developed at the University of Auckland.
There is also an online version of iNZight.
Note that the Mac version is rather out of date. Mac users are encouraged to use the online version to get the full functionality of iNZight.
Here is a document from STAT193 which gives a good summary of some common operations in iNZight. Some of these operations are also covered in this manual.
1.1 iNZight and Sample Surveys
The most recent versions of iNZight (>=3.5) incorporate the proper treatment of sample survey data. The interface is exactly the same as for regular analyses of a dataset, except that the user first specifies a sample design to attach to the dataset.
This manual gives an introduction to using iNZight with sample survey data.
The functions underlying iNZight's survey analysis features come from the R survey package, written by Thomas Lumley and documented in the book Complex Surveys: A Guide to Analysis Using R (Lumley 2010). If you are familiar with R, you can do everything iNZight can do directly with R code using the survey package.
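As a minimal sketch of what working directly in R looks like, the code below attaches a stratified design to one of the example datasets shipped with the survey package and estimates a mean. It assumes the survey package is installed; the object names dstrat and est are illustrative only, and this is not necessarily the exact code that iNZight runs internally.

```r
# Load the survey package and its bundled California schools data
library(survey)
data(api)

# Attach a stratified design to the apistrat sample:
# one-stage sampling (id = ~1), strata defined by school type,
# sampling weights in pw, and stratum population sizes in fpc
dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw,
                    fpc = ~fpc, data = apistrat)

# Design-based estimate of the mean 2000 API score, with its standard error
est <- svymean(~api00, dstrat)
print(est)
```

Specifying the design first and then passing the design object to the estimation function mirrors the iNZight workflow, in which the design is attached to the dataset before any analysis is run.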
1.2 Using Excel
iNZight isn’t designed to do all of the calculations you might need to do. It’s a good idea to be familiar with a spreadsheet program like Excel, particularly for cleaning up your data before loading it into iNZight.
Here is a document from STAT193 which gives a good summary of some common operations in Excel.
1.3 R Console
When iNZight starts, two windows open: the main window in which we work, and an R Console. R code can be typed directly into the console at the prompt >. The datasets that are imported into iNZight are available at the R Console. Note, however, that the R installation that comes with iNZight is separate from any you might have installed yourself (for example, for use with RStudio), so R packages you have added there will not be available.
This manual does not cover the general writing of R code; however, some simple calculations may be conveniently done at the R Console. There are brief details on the use of the R Console in Chapter 9.
Some of the calculations in this manual require simple use of the R Console, but all of them can also be done on a standard scientific or graphics calculator.
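To give a flavour of the kind of calculator-style arithmetic that can be typed at the R Console, here is a small sketch that turns an estimate and its standard error into an approximate 95% confidence interval. The numbers are invented for illustration and are not results from any dataset in this manual; the multiplier 1.96 assumes a normal approximation.

```r
# Illustrative values only (not taken from any dataset in this manual)
estimate <- 650   # a point estimate, e.g. an estimated mean
se <- 10          # its standard error

# Approximate 95% confidence interval: estimate +/- 1.96 standard errors
lower <- estimate - 1.96 * se
upper <- estimate + 1.96 * se
c(lower = lower, upper = upper)
```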
1.4 Notation used in this manual
Instructions such as File > Preferences mean: click the File menu and select Preferences.
1.5 Example Dataset - API
In this manual we make extensive use of a dataset on student performance in California schools. The dataset is distributed with the survey package in R, and includes a number of unit record datasets drawn from a population dataset, apipop, under various sample designs. The population units are the schools, and a number of characteristics are available for each. Full documentation can be found in the survey package. The variables we use most are:
stype - School type (Elementary/Middle/High School)
snum - School number
dnum - District number
api99 - Academic Performance Indicator for the school in the year 1999
api00 - Academic Performance Indicator for the school in the year 2000
sch.wide - Did the school meet its school-wide growth target?
awards - Is the school eligible for an awards program?
meals - Percentage of students eligible for subsidised meals
ell - Percentage of students that are English Language Learners
grad.sch - Percentage of students with parents with a postgraduate education
enroll - Number of students enrolled
api.stu - Number of students tested in the API
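For readers who want to inspect the data from R, the following sketch loads the API datasets from the survey package and checks the variables described in this section. It assumes the survey package is installed; the helper vector vars is introduced here purely for illustration.

```r
library(survey)
data(api)   # loads apipop, apisrs, apistrat, apiclus1 and apiclus2

# The variables used most often in this manual
vars <- c("stype", "snum", "dnum", "api99", "api00", "sch.wide",
          "awards", "meals", "ell", "grad.sch", "enroll", "api.stu")

# Confirm that every one of them is present in the population dataset
stopifnot(all(vars %in% names(apipop)))

# Inspect the type and first few values of each variable
str(apipop[, vars])

# Number of schools of each type in the population
# and in the stratified sample
table(apipop$stype)
table(apistrat$stype)
```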
1.6 iNZight is still being developed
There are a few items of functionality that are under development in iNZight, and a couple of bugs to iron out too.
From version 3.5.3 (11 May 2020) onwards the software calculates design effects for estimates where possible. This version also fixes an earlier bug in which the replicate weight specification described in Chapter 6 did not work for the jackknife.
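As a rough illustration of what a design effect is, the sketch below requests one from the survey package for a one-stage cluster sample of the API data, and then builds a jackknife replicate-weight version of the same design. It assumes the survey package is installed, uses illustrative object names (dclus1, rclus1), and is not a claim about how iNZight performs these steps internally.

```r
library(survey)
data(api)

# One-stage cluster sample: school districts (dnum) are the clusters
dclus1 <- svydesign(id = ~dnum, weights = ~pw, fpc = ~fpc, data = apiclus1)

# deff = TRUE adds the design effect: the variance under this design
# relative to the variance under simple random sampling
est <- svymean(~api00, dclus1, deff = TRUE)
print(est)
deff(est)

# Convert to a replicate-weight design using the (unstratified) jackknife
rclus1 <- as.svrepdesign(dclus1, type = "JK1")
est_jk <- svymean(~api00, rclus1)
print(est_jk)
```

A design effect well above 1, as is typical for cluster samples, indicates that the estimate is less precise than it would be under a simple random sample of the same size.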
Version 3.5.3 also allows intercepts to be omitted from regression models, making it suitable for ratio estimation.
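One way to see the connection between a no-intercept regression and ratio estimation is sketched below in R. It relies on a standard identity: a regression through the origin with weights proportional to (sampling weight) / x gives a slope equal to the ratio of weighted totals. The choice of api.stu and enroll as the pair of variables is an assumption made for illustration, and the way iNZight itself sets up ratio estimation is covered elsewhere in the manual rather than here.

```r
library(survey)
data(api)

dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw,
                    fpc = ~fpc, data = apistrat)

# Direct ratio estimate:
# (weighted total of api.stu) / (weighted total of enroll)
r_direct <- svyratio(~api.stu, ~enroll, dstrat)

# The same point estimate from a regression with the intercept omitted
# (the "- 1" term), using weights proportional to pw / enroll
fit <- lm(api.stu ~ enroll - 1, data = apistrat, weights = pw / enroll)

c(svyratio = as.numeric(coef(r_direct)),
  regression = as.numeric(coef(fit)))
```

Only the point estimates are compared here. The standard error reported by lm is not design-based, so for inference the design-based result from svyratio should be preferred.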
1.7 Important Sample Designs
Some of the most important sample designs covered in this manual.
Lumley, Thomas. 2010. Complex Surveys: A Guide to Analysis Using R. Hoboken, NJ: Wiley.