# iNZight and Sample Surveys

*2020-05-15*

# Chapter 1 Introduction

iNZight is a free software package based on the R language, but you don’t need to know any R to use it. It was developed at the University of Auckland.

You can get the software for Windows, Mac or Linux from the iNZight webpage. The page also contains links to some user guides.

There is also an online version of iNZight.

Note that **the Mac version is rather out of date**. Mac users are encouraged to use the online version to get the full functionality of iNZight.

And here is a document from STAT193 which gives a good summary of some common operations in iNZight. Some of these operations are also covered in this manual.

## 1.1 iNZight and Sample Surveys

The most recent versions of iNZight (>=3.5) incorporate the proper treatment of sample survey data. The interface is exactly the same as for regular analyses of a dataset, except that the user first specifies a sample design to attach to the dataset.

This manual gives an introduction to iNZight for the purpose of using it with Sample Survey data.

The survey sampling features in iNZight are built on functions from the R `survey` package, written by Thomas Lumley and documented in the book *Complex Surveys: A Guide to Analysis Using R* (Lumley 2010). If you are familiar with R, you can do everything iNZight can do directly with R code using the `survey` package.
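For readers who do use R, the following is a minimal sketch of what working directly with the `survey` package can look like. It assumes the package is installed and uses the stratified sample `apistrat` from the API data distributed with it. The object names `dstrat` and `est` are illustrative only, and this is not necessarily the code iNZight runs internally.

```r
library(survey)

# Load the API example datasets (apipop, apisrs, apistrat, apiclus1, apiclus2)
data(api)

# Attach a sample design to the data: a stratified sample of schools,
# stratified by school type, with sampling weights `pw` and
# finite population corrections `fpc` taken from the dataset itself
dstrat <- svydesign(ids = ~1, strata = ~stype, weights = ~pw,
                    fpc = ~fpc, data = apistrat)

# Design-based estimate of the mean of api00, with its standard error
est <- svymean(~api00, dstrat)
print(est)
```

The design object plays the same role as the sample design that is attached to a dataset in iNZight: once it is specified, the analysis functions use it automatically.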

## 1.2 Using Excel

iNZight isn’t designed to do all of the calculations you might need to do. It’s a good idea to be familiar with a spreadsheet program like Excel, particularly for cleaning up your data before loading it into iNZight.

Here is a document from STAT193 which gives a good summary of some common operations in Excel.

## 1.3 R Console

When iNZight starts, two windows open: one is the main window in which we work, and the second is an `R Console`. R code can be typed directly into this window at the prompt `>`. The datasets that are imported into iNZight are available at the R Console. Note, however, that the R installation that comes with iNZight is separate from one you might have installed yourself with RStudio, so R packages you have added there will not be available.

This manual does not cover the general writing of R code; however, some simple calculations may be conveniently done at the R Console. There are brief details on the use of the R Console in Chapter 9.

Some of the calculations in this manual require simple use of the R Console, but all of them can also be done on a standard scientific calculator or a graphics calculator.
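As an illustration of the kind of calculator-style arithmetic that can be typed at the R Console, the sketch below computes an approximate 95% confidence interval from an estimate and its standard error. The numbers are hypothetical and are not taken from any dataset in this manual.

```r
# Hypothetical values, for illustration only
estimate <- 650   # a point estimate
se <- 10          # its standard error

# Normal quantile for a 95% interval (approximately 1.96)
z <- qnorm(0.975)

# Lower and upper limits: estimate minus/plus z times the standard error
ci <- estimate + c(-1, 1) * z * se
round(ci, 1)
# → 630.4 669.6
```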

## 1.4 Notation used in this manual

An instruction such as `File > Preferences` means click the `File` menu and select the `Preferences` option.

## 1.5 Example Dataset: API

In this manual we make extensive use of a dataset on student performance in California schools. The dataset is distributed with the `survey` library in R, and includes a number of unit record datasets drawn from a population dataset `apipop` under various sample designs. The population units are the schools, and a number of characteristics are available for each. Full documentation can be found in the `survey` package. The variables we use most are:

- `stype` - School type (Elementary/Middle/High School)
- `snum` - School number
- `dnum` - District number
- `api99` - Academic Performance Indicator for the school in the year 1999
- `api00` - Academic Performance Indicator for the school in the year 2000
- `sch.wide` - Did the school meet its school-wide growth target?
- `awards` - Is the school eligible for an awards program?
- `meals` - Percentage of students eligible for subsidised meals
- `ell` - Percentage of students that are English Language Learners
- `grad.sch` - Percentage of students with parents with a postgraduate education
- `enroll` - Number of students enrolled
- `api.stu` - Number of students tested in the API
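If you have R and the `survey` package available, these variables can be inspected directly. The following is a minimal sketch; the name `vars` is introduced here purely for illustration.

```r
library(survey)

# Load the API datasets: the population (apipop) and the samples drawn from it
data(api)

# The variables used most often in this manual
vars <- c("stype", "snum", "dnum", "api99", "api00", "sch.wide",
          "awards", "meals", "ell", "grad.sch", "enroll", "api.stu")

# Structure of these variables in the population dataset
str(apipop[, vars])

# Number of sampled schools of each type in the stratified sample
table(apistrat$stype)
```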

## 1.6 iNZight is still being developed

There are a few items of functionality that are under development in iNZight, and a couple of bugs to iron out too.

From version 3.5.3 (11 May 2020) onwards the software calculates Design Effects for estimates where possible. This version also fixes an earlier bug in which the replicate weight specification described in Chapter 6 did not work for the jackknife.
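For readers working in R, the sketch below shows how a design effect and a jackknife variance estimate can be requested from the `survey` package. It is an illustration under stated assumptions rather than a description of what iNZight does internally. It builds jackknife replicate weights from a stratified design, which is related to, but not the same as, specifying replicate weights supplied with a dataset. The names `dstrat`, `est` and `rstrat` are illustrative.

```r
library(survey)
data(api)

# Stratified design for the apistrat sample
dstrat <- svydesign(ids = ~1, strata = ~stype, weights = ~pw,
                    fpc = ~fpc, data = apistrat)

# Mean of api00 together with its design effect
est <- svymean(~api00, dstrat, deff = TRUE)
print(est)
deff(est)

# Convert to a replicate-weights design using the stratified jackknife (JKn),
# then re-estimate the mean with jackknife standard errors
rstrat <- as.svrepdesign(dstrat, type = "JKn")
svymean(~api00, rstrat)
```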

Version 3.5.3 also allows intercepts to be omitted from regression models, making it suitable for ratio estimation.

## 1.7 Important Sample Designs

Some of the most important sample designs covered in this manual.

### References

Lumley, Thomas. 2010. *Complex Surveys: A Guide to Analysis Using R*. Hoboken, NJ: Wiley.