BigBrotherBot v1.8.0
System Development Information for the BigBrotherBot project.

b3::lib::statlib::anova Namespace Reference

Functions

def aanova
def Dfull_model
def Drestrict_mean
def Drestrict_source
def multivar_SScalc
 Save it so you don't have to calculate it again next time.
def subtr_cellmeans
def F_value_wilks_lambda
def member
def setsize
def subset
def propersubset
def numlevels
def numbitson
def makebin
def makelist
def round4

Variables

list alluniqueslist = [0]
 Create a list of all unique values in each column, and a list of these Ns.
list Nlevels = [0]
tuple Ncells = N.multiply.reduce(Nlevels[1:])
tuple Nfactors = len(Nlevels[1:])
int Nallsources = 2
tuple Nsubjects = len(alluniqueslist[0])
tuple Bwithins = findwithin(data)
 Within-subj factors defined as those where there are fewer subj than scores in the first level of a factor (quick and dirty; findwithin() below)
tuple Bbetweens = ~Bwithins&(Nallsources-1)
tuple Wcolumns = makelist(Bwithins,Nfactors+1)
list Wscols = [0]
tuple Bscols = makelist(Bbetweens+1,Nfactors+1)
tuple Nwifactors = len(Wscols)
tuple Nwlevels = N.take(N.array(Nlevels),Wscols)
tuple Nbtwfactors = len(Bscols)
tuple Nblevels = N.take(N.array(Nlevels),Bscols)
int Nwsources = 2
 Nbsources = Nallsources-Nwsources
tuple M = pstat.collapse(data,Bscols,-1,None,None,mean)
tuple Marray = N.zeros(Nblevels[1:],'f')
tuple Narray = N.zeros(Nblevels[1:],'f')
list idx = []
list coefflist
int dindex = 0
list NDs = [0]
tuple cdata = pstat.collapse(data,range(Nfactors+1),-1,None,None,mean)
int dummyval = 1
tuple datavals = pstat.colex(data,-1)
tuple DA = N.ones(Nlevels,'f')
tuple subjslots = N.ones((Nsubjects,1))
list new = alluniqueslist[j]
tuple btwidx = N.take(idx,N.array(Bscols))
int dcount = 1
list Bwsources = []
list Bwonly_sources = []
tuple D = N.zeros(Nwsources,N.PyObject)
list DM = [0]
list DN = [0]
float dwsc = 1.0
tuple Bnonsource = (Nallsources-1)
tuple Bwscols = makebin(Wscols)
 Bwithinnonsource = Bnonsource&Bwscols
tuple Lwithinnonsource = makelist(Bwithinnonsource,Nfactors+1)
 mns = dwsc
 Bwithinsource = source&Bwscols
tuple Lwithinsourcecol = makelist(Bwithinsource, Nfactors+1)
tuple Lsourceandbtws = makelist(source | Bbetweens, Nfactors+1)
tuple dvarshape = N.array(N.take(mns.shape,Lwithinsourcecol[1:]))
tuple idxarray = N.indices(dvarshape)
tuple newshape
tuple indxlist = N.swapaxes(N.reshape(idxarray,newshape),0,1)
tuple coeffmatrix = N.ones(mns.shape,N.Float)
tuple Wsourcecol = makelist(Bwscols&source,Nfactors+1)
list nlevels = coeffmatrix.shape[0]
list nextcoeff = coefflist[nlevels-1]
 scratch = coeffmatrix*mns
list tmp = D[dcount]
list variables = D[dcount]
tuple tidx = range(1,len(subjslots.shape))
tuple tsubjslots = N.transpose(subjslots,tidx)
tuple DMarray
tuple DNarray
tuple loopcap = N.array(tsubjslots.shape[0:-1])
tuple thismean
tuple BNs = pstat.colex([Nlevels],Bscols[1:])
tuple sourcecols = makelist(source-1,Nfactors+1)
tuple Lsource = makelist((Nallsources-1)&Bbetweens,Nfactors+1)
 HERE BEGINS THE MAXWELL & DELANEY APPROACH TO CALCULATING SS.
tuple btwcols = map(Bscols.index,Lsource)
tuple hn = aharmonicmean(Narray,-1)
float SSw = 0.0
tuple idxlist = pstat.unique(pstat.colex(M,btwcols))
list newval = row[-1]
tuple btwsourcecols = (N.array(map(Bscols.index,Lsource))-1)
 Bbtwnonsourcedims = ~source&Bbetweens
tuple Lbtwnonsourcedims = makelist(Bbtwnonsourcedims,Nfactors+1)
tuple btwnonsourcedims = (N.array(map(Bscols.index,Lbtwnonsourcedims))-1)
tuple sourceMarray = amean(Marray,btwnonsourcedims,1)
 Average Marray over non-source dimensions (1=keep squashed dims)
tuple sourceNarray = aharmonicmean(Narray,btwnonsourcedims,1)
 Calculate harmonic means for each level in source.
tuple ga
 Calc grand average (ga), used for ALL effects.
float sub_effects = 1.0
 Calc all SUBSOURCES to be subtracted from sourceMarray (M&D p.320)
 effect = sourceMarray-sub_effects
 Calc this effect (a(j)'s, b(k)'s, ab(j,k)'s, whatever)
tuple SS
 Save it so you don't have to calculate it again next time.
tuple collapsed = pstat.collapse(M,btwcols,-1,None,len,mean)
 Save it so you don't have to calculate it again next time.
tuple contrastmns = pstat.collapse(collapsed,btwsourcecols,-2,sterr,len,mean)
tuple contrastns
tuple contrasthns
tuple sourceNs = pstat.colex([Nlevels],makelist(source-1,Nfactors+1))
tuple dfnum = N.multiply.reduce(N.ravel(N.array(sourceNs)-1))
tuple dfden = Nsubjects-N.multiply.reduce(N.ravel(BNs))
 MS = SS/dfnum
 MSw = SSw/dfden
 f = MS/MSw
tuple prob = fprob(dfnum, dfden, f)
tuple sourcewithins = (source-1)
 RESTRICT COLUMNS/DIMENSIONS SPECIFIED IN source (BINARY) (i.e., is the value of source not equal to 0 or -1?)
list workD = D[Bwonly_sources.index(sourcewithins)]
 Set up workD and subjslots for upcoming calcs.
tuple ef = Dfull_model(workD,subjslots)
 Calculate full-model sums of squares.
tuple er = Drestrict_mean(workD,subjslots)
list p = workD.shape[1]
tuple k = N.multiply.reduce(N.ravel(BNs))
int m = 1
tuple d_en = float(p**2 + (k-1)**2 - 5)
float s = 1.0
tuple lmbda = LA.determinant(ef)
tuple W = math.pow(lmbda,(1.0/s))
string suffix = ''
tuple adjsourcecols = N.array(makelist(source-1,Nfactors+1))
string thiseffect = ''
tuple outputlist
tuple prefixcols = range(len(collapsed[0][:-3]))
tuple outlist = pstat.colex(collapsed,prefixcols)
list eff = []
list title = [['FACTORS: ','RANDOM'] + effects[:Nfactors]]
 OUTPUT FINAL RESULTS (ALL SOURCES TOGETHER) Note: All 3 types of source-calcs fall through to here.
list facttypes = ['BETWEEN']
tuple sourcebetweens = (source-1)
tuple all_cellmeans = N.transpose(DM[dindex],[-1]+range(0,len(DM[dindex].shape)-1))
tuple all_cellns = N.transpose(DN[dindex],[-1]+range(0,len(DN[dindex].shape)-1))
list levels = D[dindex]
 DO SS CALCS ON THE OUTPUT FROM THE SOURCE=0 AND SOURCE=-1 CASES.
tuple SSm = N.zeros((levels,levels),'f')
tuple tworkd = N.transpose(D[dindex])
tuple RSw = N.zeros((levels,levels),'f')
 Calculate SSw, within-subj variance (Lindman approach)
tuple RSinter = N.zeros((levels,levels),N.PyObject)
list cross = all_cellmeans[i]
tuple multfirst = asum(cross*all_cellns[i])
list sourceDMarray = DM[dindex]
 Average Marray over non-source dimensions.
tuple sourceDNarray = aharmonicmean(DN[dindex],btwnonsourcedims,1)
 Calculate harmonic means for each level in source.
tuple variableNs
 Calc grand average (ga), used for ALL effects.
tuple subsourcebtw = (subsource-1)
 Make a list of the non-subsource dimensions.
tuple sserr = N.zeros((levels,levels),'f')
tuple ssval = N.add.reduce(workd[i]*workd[j])
list mask = tsubjslots[idx]
 WHILE STILL MORE GROUPS, CALCULATE GROUP MEAN FOR EACH D-VAR.
list thisgroup = tworkd*mask[N.NewAxis,:]
tuple groupmns = amean(N.compress(mask,thisgroup),1)
tuple errors = errors-N.multiply.outer(groupmns,mask)
 THEN SUBTRACT THEM FROM APPROPRIATE SUBJECTS.

Function Documentation

def b3::lib::statlib::anova::aanova (   data,
  effects = ['A','B','C','D','E','F','G','H','I','J','K'] 
)
    Prints the results of single-variable between- and within-subject ANOVA
    designs.  The function can only handle univariate ANOVAs with a single
    random factor.  The random factor is coded in column 0 of the input
    list/array (see below) and the measured variable is coded in the last
    column of the input list/array. The following were used as references
    when writing the code:

    Maxwell, SE, Delaney, HD (1990) Designing Experiments and Analyzing
    Data, Wadsworth: Belmont, CA.
    Lindman, HR (1992) Analysis of Variance in Experimental Design,
    Springer-Verlag: New York.

    TO DO:  Increase Current Max Of 10 Levels Per W/I-Subject Factor
    Consolidate Between-Subj Analyses For Between And Within/Between
    Front-end for different input data-array shapes/organization
    Axe mess of 'global' statements (particularly for Drestrict fcns)

    Usage:   aanova(data,                        data = |Stat format
           effects=['A','B','C','D','E','F','G','H','I','J','K'])

    Note: |Stat format is as follows: one datum per row. The first element of
    each row is the subject identifier, followed by all within/between subject
    variable designators, with the measured data point as the last element.
    For example, [1, 'short', 'drugY', 2, 14.7] means that subject 1, measured
    in the short / drugY / 2 condition, gave a measured value of 14.7. All
    input lists are therefore '2D' lists-of-lists.
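
    As a rough illustration of the |Stat layout described above, the sketch
    below builds a small list-of-lists in Python. The subject IDs, factor
    labels and scores are invented, and the helper check_stat_format is
    hypothetical (it is not part of this module).

```python
# Hypothetical data in |Stat format: [subject, factor levels..., measurement]
data = [
    [1, 'short', 'drugY', 14.7],
    [1, 'long',  'drugY', 16.2],
    [2, 'short', 'drugX', 11.9],
    [2, 'long',  'drugX', 13.4],
]

def check_stat_format(rows):
    """Return True if every row has the same length and a numeric last element."""
    width = len(rows[0])
    return all(len(r) == width and isinstance(r[-1], (int, float)) for r in rows)

subjects = sorted(set(row[0] for row in data))   # column 0: the random factor
scores = [row[-1] for row in data]               # last column: measured variable
```

    In this made-up data set the first factor varies within each subject and
    the second varies between subjects.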
    
def b3::lib::statlib::anova::Dfull_model (   workd,
  subjslots 
)
RESTRICTS NOTHING (i.e., FULL MODEL CALCULATION).  Subtracts D-variable
   cell-mean for each between-subj group and then calculates the SS array.
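
The full-model step described above can be sketched roughly as follows. This is not the module's code: the function name full_model_ss is hypothetical, and group membership is passed as a plain list of labels (an assumption made for readability), whereas the module uses a 1/0 subjslots array.

```python
def full_model_ss(scores, groups):
    """scores: one row per subject, one column per D-variable.
    groups: between-subject group label for each subject (an assumption;
    the module itself uses a 1/0 subjslots array instead)."""
    ncols = len(scores[0])
    # Mean of each D-variable within each between-subject group
    means = {}
    for g in set(groups):
        rows = [s for s, lab in zip(scores, groups) if lab == g]
        means[g] = [sum(r[c] for r in rows) / len(rows) for c in range(ncols)]
    # Residuals after removing the group cell-means
    resid = [[s[c] - means[g][c] for c in range(ncols)]
             for s, g in zip(scores, groups)]
    # Sums of squares and cross-products (ncols x ncols)
    return [[sum(r[i] * r[j] for r in resid) for j in range(ncols)]
            for i in range(ncols)]

ss = full_model_ss([[1, 2], [3, 4], [5, 8], [7, 10]], ['a', 'a', 'b', 'b'])
```

The diagonal of the result holds the within-group sum of squares of each D-variable; the off-diagonal entries hold the cross-products.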
def b3::lib::statlib::anova::Drestrict_mean (   workd,
  subjslots 
)
RESTRICTS GRAND MEAN.  Subtracts D-variable cell-mean for each between-
   subj group, and then adds back each D-variable's grand mean.
def b3::lib::statlib::anova::Drestrict_source (   workd,
  subjslots,
  source 
)
   Calculates error for a given model on array workd.  Subjslots is an
   array of 1s and 0s corresponding to whether or not the subject is a
   member of that between-subjects variable combo.  source is the code
   for the type of model to calculate.  source=-1 means no restriction;
   source=0 means to restrict workd's grand mean; source>0 means to
   restrict the columns of the main data array, DA, specified (in binary)
   by the source-value.

   Usage:   Drestrict_source(workd,subjslots,source)  source:-1=nothing, 0=mean
   Returns: SS array for multivariate F calculation
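
   The binary source encoding mentioned above can be illustrated with a
   small sketch. The two helpers below are hypothetical stand-ins inferred
   from how makebin(Wscols) and makelist(source,Nfactors+1) are called on
   this page; the assumption is that bit j of the source value corresponds
   to column j of the data array.

```python
def makebin_sketch(sourcelist):
    """Pack a list of column numbers into one integer bitmask."""
    out = 0
    for col in sourcelist:
        out |= 1 << col          # turn on the bit for this column
    return out

def makelist_sketch(source, ncols):
    """Unpack a bitmask into the list of column numbers whose bits are on."""
    return [j for j in range(ncols) if source & (1 << j)]

mask = makebin_sketch([1, 3])        # columns 1 and 3
cols = makelist_sketch(mask, 4)      # back to a list of column numbers
```

   Under this reading, a source value of 10 (binary 1010) would select
   columns 1 and 3.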
   
def b3::lib::statlib::anova::F_value_wilks_lambda (   ER,
  EF,
  dfnum,
  dfden,
  a,
  b 
)
   Calculation of Wilks' lambda F-statistic for multivariate data, per
   Maxwell & Delaney p.657.

   Usage:   F_value_wilks_lambda(ER,EF,dfnum,dfden,a,b)
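
   As background, Wilks' lambda is conventionally the ratio of the
   determinant of the full-model error matrix to that of the restricted-model
   error matrix. The sketch below computes only that ratio; it does not
   reproduce the conversion to an F value given on p.657 of Maxwell &
   Delaney. The names det and wilks_lambda are hypothetical, and the
   matrices are invented.

```python
def det(m):
    """Determinant of a small square matrix by cofactor expansion."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def wilks_lambda(ER, EF):
    """Ratio |EF| / |ER| of full-model to restricted-model error matrices."""
    return det(EF) / det(ER)

EF = [[4.0, 1.0], [1.0, 3.0]]    # invented full-model error matrix
ER = [[8.0, 2.0], [2.0, 6.0]]    # invented restricted-model error matrix
lam = wilks_lambda(ER, EF)
```

   Values of lambda near 1 mean the restriction costs little; values near 0
   mean the restricted model fits much worse than the full model.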
   
def b3::lib::statlib::anova::makebin (   sourcelist)
def b3::lib::statlib::anova::makelist (   source,
  ncols 
)
def b3::lib::statlib::anova::member (   factor,
  source 
)
def b3::lib::statlib::anova::multivar_SScalc (   workd)

Save it so you don't have to calculate it again next time.

def b3::lib::statlib::anova::numbitson (   a)
def b3::lib::statlib::anova::numlevels (   source,
  Nlevels 
)
def b3::lib::statlib::anova::propersubset (   a,
  b 
)
def b3::lib::statlib::anova::round4 (   num)
def b3::lib::statlib::anova::setsize (   source)
def b3::lib::statlib::anova::subset (   a,
  b 
)
def b3::lib::statlib::anova::subtr_cellmeans (   workd,
  subjslots 
)
   Subtract all cell means when within-subjects factors are present ...
   i.e., calculate full-model using a D-variable.
   

Variable Documentation

tuple b3::lib::statlib::anova::adjsourcecols = N.array(makelist(source-1,Nfactors+1))
tuple b3::lib::statlib::anova::all_cellmeans = N.transpose(DM[dindex],[-1]+range(0,len(DM[dindex].shape)-1))
tuple b3::lib::statlib::anova::all_cellns = N.transpose(DN[dindex],[-1]+range(0,len(DN[dindex].shape)-1))

list b3::lib::statlib::anova::alluniqueslist = [0]

Create a list of all unique values in each column, and a list of these Ns.

list b3::lib::statlib::anova::BNs = pstat.colex([Nlevels],Bscols[1:])
tuple b3::lib::statlib::anova::btwcols = map(Bscols.index,Lsource)
tuple b3::lib::statlib::anova::btwidx = N.take(idx,N.array(Bscols))
tuple b3::lib::statlib::anova::btwnonsourcedims = (N.array(map(Bscols.index,Lbtwnonsourcedims))-1)
tuple b3::lib::statlib::anova::btwsourcecols = (N.array(map(Bscols.index,Lsource))-1)
tuple b3::lib::statlib::anova::Bwithins = findwithin(data)

Within-subj factors defined as those where there are fewer subj than scores in the first level of a factor (quick and dirty; findwithin() below)

tuple b3::lib::statlib::anova::cdata = pstat.collapse(data,range(Nfactors+1),-1,None,None,mean)
list b3::lib::statlib::anova::coefflist
Initial value:
[[[1]],
 [[-1,1]],
 [[-1,0,1],[1,-2,1]],
 [[-3,-1,1,3],[1,-1,-1,1],[-1,3,-3,1]],
 [[-2,-1,0,1,2],[2,-1,-2,-1,2],[-1,2,0,-2,1],[1,-4,6,-4,1]],
 [[-5,-3,-1,1,3,5],[5,-1,-4,-4,-1,5],[-5,7,4,-4,-7,5],
  [1,-3,2,2,-3,1],[-1,5,-10,10,-5,1]],
 [[-3,-2,-1,0,1,2,3],[5,0,-3,-4,-3,0,5],[-1,1,1,0,-1,-1,1],
  [3,-7,1,6,1,-7,3],[-1,4,-5,0,5,-4,1],[1,-6,15,-20,15,-6,1]],
 [[-7,-5,-3,-1,1,3,5,7],[7,1,-3,-5,-5,-3,1,7],
  [-7,5,7,3,-3,-7,-5,7],[7,-13,-3,9,9,-3,-13,7],
  [-7,23,-17,-15,15,17,-23,7],[1,-5,9,-5,-5,9,-5,1],
  [-1,7,-21,35,-35,21,-7,1]],
 [[-4,-3,-2,-1,0,1,2,3,4],[28,7,-8,-17,-20,-17,-8,7,28],
  [-14,7,13,9,0,-9,-13,-7,14],[14,-21,-11,9,18,9,-11,-21,14],
  [-4,11,-4,-9,0,9,4,-11,4],[4,-17,22,1,-20,1,22,-17,4],
  [-1,6,-14,14,0,-14,14,-6,1],[1,-8,28,-56,70,-56,28,-8,1]],
 [[-9,-7,-5,-3,-1,1,3,5,7,9],[6,2,-1,-3,-4,-4,-3,-1,2,6],
  [-42,14,35,31,12,-12,-31,-35,-14,42],
  [18,-22,-17,3,18,18,3,-17,-22,18],
  [-6,14,-1,-11,-6,6,11,1,-14,6],[3,-11,10,6,-8,-8,6,10,-11,3],
  [9,-47,86,-42,-56,56,42,-86,47,-9],
  [1,-7,20,-28,14,14,-28,20,-7,1],
  [-1,9,-36,84,-126,126,-84,36,-9,1]]]
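
The table above reads like sets of orthogonal polynomial contrast coefficients (linear, quadratic, and so on) for factors with 1 to 10 levels; that interpretation is an assumption based on the values, not something this page states. The sketch below checks the defining properties for the 3-level and 4-level entries copied from the table. The helper name is hypothetical.

```python
# Entries for 3 and 4 levels, copied from the table above
three_levels = [[-1, 0, 1], [1, -2, 1]]
four_levels = [[-3, -1, 1, 3], [1, -1, -1, 1], [-1, 3, -3, 1]]

def is_orthogonal_contrast_set(rows):
    """True if each row sums to zero and every pair of rows has zero dot product."""
    sums_zero = all(sum(r) == 0 for r in rows)
    pairs_orthogonal = all(
        sum(a * b for a, b in zip(rows[i], rows[j])) == 0
        for i in range(len(rows)) for j in range(i + 1, len(rows)))
    return sums_zero and pairs_orthogonal

ok3 = is_orthogonal_contrast_set(three_levels)
ok4 = is_orthogonal_contrast_set(four_levels)
```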
tuple b3::lib::statlib::anova::coeffmatrix = N.ones(mns.shape,N.Float)
tuple b3::lib::statlib::anova::collapsed = pstat.collapse(M,btwcols,-1,None,len,mean)

Save it so you don't have to calculate it again next time.

tuple b3::lib::statlib::anova::contrasthns
Initial value:
pstat.collapse(collapsed,btwsourcecols,-1,None,None,
               harmonicmean)
tuple b3::lib::statlib::anova::contrastmns = pstat.collapse(collapsed,btwsourcecols,-2,sterr,len,mean)
tuple b3::lib::statlib::anova::contrastns
Initial value:
pstat.collapse(collapsed,btwsourcecols,-1,None,None,
               N.sum)
tuple b3::lib::statlib::anova::D = N.zeros(Nwsources,N.PyObject)
tuple b3::lib::statlib::anova::d_en = float(p**2 + (k-1)**2 - 5)
tuple b3::lib::statlib::anova::datavals = pstat.colex(data,-1)
float b3::lib::statlib::anova::dfden = Nsubjects-N.multiply.reduce(N.ravel(BNs))
tuple b3::lib::statlib::anova::dfnum = N.multiply.reduce(N.ravel(N.array(sourceNs)-1))
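
The two degrees-of-freedom expressions above can be tried on a small between-subjects design. The sketch below is a plain-Python restatement, not the module's code: the function name between_dfs and the example design are invented, and it assumes every factor in the design is a between-subjects factor.

```python
from math import prod

def between_dfs(source_levels, between_levels, n_subjects):
    """dfnum: product of (levels - 1) over the factors in the source.
    dfden: number of subjects minus the number of between-subject cells."""
    dfnum = prod(n - 1 for n in source_levels)
    dfden = n_subjects - prod(between_levels)
    return dfnum, dfden

# Hypothetical 2 x 3 between-subjects design with 30 subjects, A x B interaction
dfnum, dfden = between_dfs([2, 3], [2, 3], 30)
```

For this design the interaction has (2-1)*(3-1) = 2 numerator degrees of freedom and 30 - 6 = 24 denominator degrees of freedom.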
tuple b3::lib::statlib::anova::DMarray
Initial value:
N.zeros(list(tsubjslots.shape[0:-1]) +
        [variables],'f')
tuple b3::lib::statlib::anova::DNarray
Initial value:
N.zeros(list(tsubjslots.shape[0:-1]) +
        [variables],'f')
tuple b3::lib::statlib::anova::dvarshape = N.array(N.take(mns.shape,Lwithinsourcecol[1:]))

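tuple b3::lib::statlib::anova::ef = Dfull_model(workD,subjslots)

Calculate full-model sums of squares.

b3::lib::statlib::anova::effect = sourceMarray-sub_effects

Calc this effect (a(j)'s, b(k)'s, ab(j,k)'s, whatever)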

tuple b3::lib::statlib::anova::er = Drestrict_mean(workD,subjslots)

tuple b3::lib::statlib::anova::errors = errors-N.multiply.outer(groupmns,mask)

THEN SUBTRACT THEM FROM APPROPRIATE SUBJECTS.

tuple b3::lib::statlib::anova::ga
Initial value:
asum((sourceMarray*sourceNarray)/
     asum(sourceNarray))

Calc grand average (ga), used for ALL effects.

tuple b3::lib::statlib::anova::groupmns = amean(N.compress(mask,thisgroup),1)
tuple b3::lib::statlib::anova::hn = aharmonicmean(Narray,-1)
tuple b3::lib::statlib::anova::idxlist = pstat.unique(pstat.colex(M,btwcols))
tuple b3::lib::statlib::anova::indxlist = N.swapaxes(N.reshape(idxarray,newshape),0,1)
tuple b3::lib::statlib::anova::k = N.multiply.reduce(N.ravel(BNs))

list b3::lib::statlib::anova::levels = D[dindex]

DO SS CALCS ON THE OUTPUT FROM THE SOURCE=0 AND SOURCE=-1 CASES.

This section expects workd to have subjects in the LAST dimension.

tuple b3::lib::statlib::anova::lmbda = LA.determinant(ef)
tuple b3::lib::statlib::anova::loopcap = N.array(tsubjslots.shape[0:-1])

tuple b3::lib::statlib::anova::Lsource = makelist((Nallsources-1)&Bbetweens,Nfactors+1)

HERE BEGINS THE MAXWELL & DELANEY APPROACH TO CALCULATING SS.

tuple b3::lib::statlib::anova::M = pstat.collapse(data,Bscols,-1,None,None,mean)

list b3::lib::statlib::anova::mask = tsubjslots[idx]

WHILE STILL MORE GROUPS, CALCULATE GROUP MEAN FOR EACH D-VAR.

tuple b3::lib::statlib::anova::Narray = N.zeros(Nblevels[1:],'f')
tuple b3::lib::statlib::anova::Ncells = N.multiply.reduce(Nlevels[1:])
tuple b3::lib::statlib::anova::newshape
Initial value:
N.array([idxarray.shape[0],
         N.multiply.reduce(idxarray.shape[1:])])
list b3::lib::statlib::anova::nlevels = coeffmatrix.shape[0]
tuple b3::lib::statlib::anova::outputlist
Initial value:
(outputlist
 # These terms are for the numerator of the current effect/source
 + [[thiseffect, round4(SS),dfnum,
     round4(SS/float(dfnum)),round4(f),
     round4(prob),suffix]]
 # These terms are for the denominator for the current effect/source
 + [[thiseffect+'/w', round4(SSw),dfden,
     round4(SSw/float(dfden)),'','','']]
 + [['\n']])
int b3::lib::statlib::anova::p = workD.shape[1]
tuple b3::lib::statlib::anova::prefixcols = range(len(collapsed[0][:-3]))
tuple b3::lib::statlib::anova::RSinter = N.zeros((levels,levels),N.PyObject)

tuple b3::lib::statlib::anova::RSw = N.zeros((levels,levels),'f')

Calculate SSw, within-subj variance (Lindman approach)

tuple b3::lib::statlib::anova::sourcecols = makelist(source-1,Nfactors+1)

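list b3::lib::statlib::anova::sourceDMarray = DM[dindex]

Average Marray over non-source dimensions.

tuple b3::lib::statlib::anova::sourceDNarray = aharmonicmean(DN[dindex],btwnonsourcedims,1)

Calculate harmonic means for each level in source.

If GRAND interaction, use harmonic mean of ALL cell Ns.

tuple b3::lib::statlib::anova::sourceMarray = amean(Marray,btwnonsourcedims,1)

Average Marray over non-source dimensions (1=keep squashed dims)

tuple b3::lib::statlib::anova::sourceNarray = aharmonicmean(Narray,btwnonsourcedims,1)

Calculate harmonic means for each level in source.

If GRAND interaction, use harmonic mean of ALL cell Ns.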

tuple b3::lib::statlib::anova::sourceNs = pstat.colex([Nlevels],makelist(source-1,Nfactors+1))

tuple b3::lib::statlib::anova::sourcewithins = (source-1)

RESTRICT COLUMNS/DIMENSIONS SPECIFIED IN source (BINARY) (i.e., is the value of source not equal to 0 or -1?)

tuple b3::lib::statlib::anova::SS
Initial value:
asum((effect**2 *sourceNarray) *
     N.multiply.reduce(N.take(Marray.shape,btwnonsourcedims)))

Save it so you don't have to calculate it again next time.

Calc and save sums of squares for this source
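
For a balanced design the sums-of-squares expression above reduces to a familiar form, which the sketch below works through for the main effect of the row factor in a two-factor table of cell means. It is a simplified plain-Python illustration, not the module's code: the function name, the cell means and the cell size are invented, and equal cell Ns are assumed so that harmonic means equal the common n.

```python
def main_effect_ss(cell_means, n_per_cell):
    """SS for the row factor of a balanced two-factor table of cell means."""
    n_rows, n_cols = len(cell_means), len(cell_means[0])
    row_means = [sum(r) / n_cols for r in cell_means]   # plays the role of sourceMarray
    ga = sum(row_means) / n_rows                        # grand average (equal Ns)
    effects = [m - ga for m in row_means]               # a(j) = row mean - grand average
    # sum(effect**2 * n), multiplied by the number of levels collapsed over
    return sum(e ** 2 * n_per_cell for e in effects) * n_cols

ss_a = main_effect_ss([[1, 2, 3], [3, 4, 5]], n_per_cell=4)
```

Here the row means are 2 and 4, the grand average is 3, and the result is 4 * 3 * ((-1)**2 + 1**2) = 24.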

tuple b3::lib::statlib::anova::ssval = N.add.reduce(workd[i]*workd[j])

float b3::lib::statlib::anova::sub_effects = 1.0

Calc all SUBSOURCES to be subtracted from sourceMarray (M&D p.320)

tuple b3::lib::statlib::anova::subsourcebtw = (subsource-1)

Make a list of the non-subsource dimensions.

tuple b3::lib::statlib::anova::thismean
Initial value:
(N.add.reduce(tsubjslots[idx] * # 1=subj dim
              N.transpose(D[dcount]),1) /
 DNarray[idx])
tuple b3::lib::statlib::anova::tidx = range(1,len(subjslots.shape))
list b3::lib::statlib::anova::title = [['FACTORS: ','RANDOM'] + effects[:Nfactors]]

OUTPUT FINAL RESULTS (ALL SOURCES TOGETHER) Note: All 3 types of source-calcs fall through to here.

tuple b3::lib::statlib::anova::tworkd = N.transpose(D[dindex])
tuple b3::lib::statlib::anova::variableNs
Initial value:
asum(sourceDNarray,
     range(len(sourceDMarray.shape)-1))

Calc grand average (ga), used for ALL effects.

tuple b3::lib::statlib::anova::W = math.pow(lmbda,(1.0/s))
list b3::lib::statlib::anova::workD = D[Bwonly_sources.index(sourcewithins)]

Set up workD and subjslots for upcoming calcs.
