Libraries.Compute.Statistics.Tests.CompareCounts Documentation

This class implements several proportion comparison hypothesis tests. In addition to the comparison tests, this class can be directed to run any assumption tests and post-hoc tests that typically accompany the given comparison test. Specific post-hoc analysis approaches and corrections can also be selected through this class. See the INFORMATION comment block at the bottom of this class for more information about each test. For more information: https://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test

Example Code

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareCounts

DataFrame frame
frame:Load("Data/Data.csv")
frame:AddSelectedColumnRange(0,2)

CompareCounts compare = frame:CompareCounts()
output compare:GetSummary()

Inherits from: Libraries.Compute.Statistics.DataFrameCalculation, Libraries.Compute.Statistics.Tests.StatisticalTest, Libraries.Language.Object, Libraries.Compute.Statistics.Inputs.ColumnInput, Libraries.Compute.Statistics.Inputs.FactorInput

Actions Documentation

AddColumn(integer column)

This action adds a column index to the end of the column input.

Parameters

  • integer column

AddFactor(integer column)

This action adds a column index to the end of the factor input.

Parameters

  • integer column

Calculate(Libraries.Compute.Statistics.DataFrame frame)

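This action runs the comparison on the given DataFrame. The results are kept in an array, which typically holds only one result except in pairwise tests.

Parameters

  • Libraries.Compute.Statistics.DataFrame frame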

Compare(Libraries.Language.Object object)

This action compares two object hash codes and returns an integer. The result indicates whether this object's hash code is smaller than, equal to, or larger than the hash code of the object passed as a parameter: -1 means smaller, 0 means equal, and 1 means larger. This action was changed in Quorum 7 to return an integer, instead of a CompareResult object, because the previous implementation was causing efficiency issues.

Parameters

  • Libraries.Language.Object object

Return

integer: The Compare result, Smaller, Equal, or Larger.

Example

use Libraries.Language.Object
Object o
Object t
integer result = o:Compare(t) //1 (larger), 0 (equal), or -1 (smaller)
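
The three-way result described above can be sketched outside Quorum. The following Python snippet is only an illustration of the -1, 0, 1 convention; the function name compare_hash_codes is hypothetical and is not part of the library.

```python
def compare_hash_codes(this_hash, other_hash):
    """Return -1, 0, or 1 when this_hash is smaller than, equal to, or larger than other_hash."""
    # Each comparison yields a boolean (0 or 1), so the difference is -1, 0, or 1.
    return (this_hash > other_hash) - (this_hash < other_hash)


print(compare_hash_codes(5, 3))
print(compare_hash_codes(3, 3))
print(compare_hash_codes(2, 3))
```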

CompareSeveralCounts(Libraries.Compute.Statistics.DataFrame frame)

This action represents a chi-squared test of independence on two or more columns of data. It calculates the observed values by counting the frequencies of unique items. It then calculates the expected counts and compares the two to get the χ² value. H0: The two variables are independent. Ha: The two variables are not independent.

Parameters

  • Libraries.Compute.Statistics.DataFrame frame

Example


    use Libraries.Compute.Statistics.DataFrame
    use Libraries.Compute.Statistics.Tests.CompareCounts

    DataFrame frame
    frame:Load("data.csv")
    frame:AddSelectedColumns(0)
    frame:AddSelectedColumns(1)

    CompareCounts compare = frame:CompareCounts()
    output compare:GetSummary()
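
The arithmetic behind a chi-squared test of independence can be sketched in a few lines. The Python snippet below is a minimal illustration of the textbook computation (observed counts, expected counts, χ² statistic), not the library's implementation; the function name and the sample data are made up for the example.

```python
from collections import Counter


def chi_squared_independence(rows, cols):
    """Return (statistic, degrees_of_freedom, expected) for two paired categorical lists."""
    n = len(rows)
    observed = Counter(zip(rows, cols))  # joint frequency of each (row, column) pair
    row_totals = Counter(rows)
    col_totals = Counter(cols)

    statistic = 0.0
    expected = {}
    for r in row_totals:
        for c in col_totals:
            # Expected count under independence: row total * column total / n
            e = row_totals[r] * col_totals[c] / n
            expected[(r, c)] = e
            statistic += (observed[(r, c)] - e) ** 2 / e

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# Hypothetical data: one entry per observation in each list.
smoker = ["yes", "yes", "no", "no", "yes", "no", "no", "no"]
region = ["east", "west", "east", "west", "east", "east", "west", "west"]
stat, df, expected = chi_squared_independence(smoker, region)
print(stat, df)
```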

CompareSeveralRelatedCounts(Libraries.Compute.Statistics.DataFrame frame)

This action represents a McNemar-Bowker Test of Symmetry on three or more columns of data. It calculates the observed values by counting the frequencies of unique items. It then calculates the expected counts and compares the two to get the χ² value. H0: The table of paired counts is symmetric. Ha: The table of paired counts is not symmetric.

Parameters

  • Libraries.Compute.Statistics.DataFrame frame

Example


    use Libraries.Compute.Statistics.DataFrame
    use Libraries.Compute.Statistics.Tests.CompareCounts

    DataFrame frame
    frame:Load("data.csv")
    frame:AddSelectedColumns(0)
    frame:AddSelectedColumns(1)

    CompareCounts compare = frame:CompareRelatedCounts()
    output compare:GetSummary()
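
For reference, the textbook form of Bowker's symmetry statistic compares each off-diagonal cell of a square table with its mirror image. The Python sketch below illustrates that formula only; it is not taken from the library, the function name and the table are hypothetical, and counting only non-empty pairs toward the degrees of freedom is an assumption of this sketch.

```python
def bowker_symmetry(table):
    """Return (statistic, degrees_of_freedom) for a square table of paired counts.

    table[i][j] is the number of pairs with category i on the first
    measurement and category j on the second.
    """
    k = len(table)
    statistic = 0.0
    df = 0
    for i in range(k):
        for j in range(i + 1, k):
            total = table[i][j] + table[j][i]
            if total == 0:
                continue  # an empty pair carries no information; skip it
            statistic += (table[i][j] - table[j][i]) ** 2 / total
            df += 1
    return statistic, df


# Hypothetical 3x3 table of paired counts.
counts = [
    [20, 5, 3],
    [10, 30, 4],
    [7, 8, 25],
]
stat, df = bowker_symmetry(counts)
print(stat, df)
```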

CorrectFamilyWiseError(boolean correctFamilyWiseError)

This action sets whether to correct for family-wise error. The strict method is the default for most tests if another is not selected.

Parameters

  • boolean correctFamilyWiseError

CorrectFamilyWiseError()

Returns true if family-wise error correction is enabled.

Return

boolean:

EmptyColumns()

This action empties the list, clearing out all of the items contained within it.

EmptyFactors()

This action empties the list, clearing out all of the items contained within it.

Equals(Libraries.Language.Object object)

This action determines if two objects are equal based on their hash code values.

Parameters

  • Libraries.Language.Object object

Return

boolean: True if the hash codes are equal and false if they are not equal.

Example

use Libraries.Language.Object
use Libraries.Language.Types.Text
Object o
Text t
boolean result = o:Equals(t)

GetColumn(integer index)

This action gets the item at a given location in an array.

Parameters

  • integer index

Return

integer: The item at the given location.

GetColumnIterator()

This action gets an iterator for the object and returns that iterator.

Return

Libraries.Containers.Iterator: Returns the iterator for an object.

GetColumnSize()

This action gets the size of the array.

Return

integer:

GetDegreesOfFreedom()

This returns the degrees of freedom if only one result exists.

Return

number: the Degrees of Freedom.

GetExpected()

This returns the expected frame if only one result exists.

Return

Libraries.Compute.Statistics.DataFrame: the expected frame.

GetExperimentalDesign()

This action returns the ExperimentalDesign object, which holds all design selections and the design frame.

Return

Libraries.Compute.Statistics.Tests.ExperimentalDesign:

GetFactor(integer index)

This action gets the item at a given location in an array.

Parameters

  • integer index

Return

integer: The item at the given location.

GetFactorIterator()

This action gets an iterator for the object and returns that iterator.

Return

Libraries.Containers.Iterator: Returns the iterator for an object.

GetFactorSize()

This action gets the size of the array.

Return

integer:

GetFormalSummary()

This action summarizes the results and places them into formal academic language, in APA format. For more information: https://apastyle.apa.org/instructional-aids/numbers-statistics-guide.pdf

Return

text:

GetGroups(Libraries.Compute.Statistics.DataFrame frame)

Gets the fully factored samples/groups in an array of dataframes. Using an array of dataframes instead of a single dataframe helps with multivariate cases.

Parameters

  • Libraries.Compute.Statistics.DataFrame frame

Return

Libraries.Containers.HashTable:

GetHashCode()

This action gets the hash code for an object.

Return

integer: The integer hash code of the object.

Example

Object o
integer hash = o:GetHashCode()

GetObserved()

This returns the observed frame if only one result exists.

Return

Libraries.Compute.Statistics.DataFrame: the observed frame.

GetPairwiseResults()

This returns the pairwise results if only one result exists. Pairwise results are only calculated in N-sample tests, otherwise this will return undefined.

Return

Libraries.Containers.Array: the pairwise results.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareCounts
use Libraries.Compute.Statistics.Reporting.CompareCountsResult
use Libraries.Containers.Array

DataFrame frame
frame:Load("Data/Data.csv")

//add four columns, then request pairwise tests between them
CompareCounts compare
compare:AddColumn(0)
compare:AddColumn(1)
compare:AddColumn(2)
compare:AddColumn(3)
compare:TestPairwise()
frame:Calculate(compare)

Array<CompareCountsResult> pairwise = compare:GetPairwiseResults()

GetPairwiseSummary()

This returns the pairwise summary if only one result exists. Pairwise results are only calculated in N-sample tests, otherwise this will return nothing.

Return

text: the pairwise summary.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareCounts

DataFrame frame
frame:Load("Data/Data.csv")

//add four columns, then request pairwise tests between them
CompareCounts compare
compare:AddColumn(0)
compare:AddColumn(1)
compare:AddColumn(2)
compare:AddColumn(3)
compare:TestPairwise()
frame:Calculate(compare)

output compare:GetPairwiseSummary()

GetProbabilityValue()

This returns the probability if only one result exists.

Return

number: the P-Value.
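
As background, the p-value of a chi-squared test is the upper-tail probability of the χ² distribution at the observed statistic. The Python sketch below computes it with the standard library only, using the usual recurrence for the regularized upper incomplete gamma function. It illustrates the definition, not how the library computes the value, and the function name is hypothetical.

```python
import math


def chi_squared_p_value(statistic, df):
    """Upper-tail probability of a chi-squared distribution with integer df >= 1."""
    y = statistic / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)              # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))   # closed form for df = 1
    # Recurrence: Q(a + 1, y) = Q(a, y) + y^a * e^(-y) / Gamma(a + 1)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


significance_level = 0.05  # the default named under SetSignificanceLevel
p = chi_squared_p_value(10.0, 2)
print(p, p < significance_level)
```

A result is conventionally called significant when the p-value falls below the significance level.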

GetReport(Libraries.System.File file)

This creates an HTML page with the results as its contents.

Parameters

  • Libraries.System.File file

GetResiduals()

This returns the residuals frame if only one result exists.

Return

Libraries.Compute.Statistics.DataFrame: the residuals frame.
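
This page does not say which kind of residual the frame contains. As a point of reference, the Python sketch below shows two common definitions for count data, raw residuals (observed minus expected) and Pearson residuals (raw residual divided by the square root of the expected count). The function name and the counts are hypothetical, and the choice of definitions is an assumption of this sketch.

```python
import math


def residuals(observed, expected):
    """Return (raw, pearson) residual dictionaries keyed by category."""
    raw = {k: observed[k] - expected[k] for k in expected}
    pearson = {k: raw[k] / math.sqrt(expected[k]) for k in expected}
    return raw, pearson


# Hypothetical observed and expected counts for one categorical column.
observed = {"yes": 70, "no": 40}
expected = {"yes": 60, "no": 50}
raw, pearson = residuals(observed, expected)
print(raw)
print(pearson)
```

The squared Pearson residuals sum to the χ² statistic, which is why they are often used to see which categories contribute most to a significant result.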

GetResult()

This returns a result if only one exists. If there is more than one, this action returns undefined.

Return

Libraries.Compute.Statistics.Reporting.CompareCountsResult: the CompareCountsResult.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareCounts
use Libraries.Compute.Statistics.Reporting.CompareCountsResult

DataFrame frame
frame:Load("Data/Data.csv")
frame:AddSelectedColumns("region")
CompareCounts compare = frame:CompareSelectedCounts()

CompareCountsResult result = compare:GetResult()

GetResults()

This returns the results between all computed columns.

Return

Libraries.Containers.Array: the CompareCountsResults.

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Tests.CompareCounts
use Libraries.Compute.Statistics.Reporting.CompareCountsResult
use Libraries.Containers.Array

DataFrame frame
frame:Load("Data/Data.csv")

CompareCounts compare
compare:AddColumn(0)
compare:AddColumn(1)
compare:AddColumn(2)
frame:Calculate(compare)

Array<CompareCountsResult> results = compare:GetResults()

GetSignificanceLevel()

This returns the significance level of the test (default is 0.05).

Return

number: the significance level.

GetStatisticalFormatting()

GetSummary()

This action summarizes the results and lists them informally.

Return

text:

GetTestStatistic()

This returns the χ² test statistic if only one result exists.

Return

number: the χ² test statistic.

GoodnessOfFit(Libraries.Compute.Statistics.DataFrame frame)

This action represents a goodness of fit chi-squared test on the selected columns of data. It calculates the observed values by counting the frequencies of unique items. It then calculates the expected counts (expecting an equal distribution) and compares the two to get the χ² value. H0: The population fits a uniform distribution. Ha: The population does not fit a uniform distribution.

Parameters

  • Libraries.Compute.Statistics.DataFrame frame

Example


    use Libraries.Compute.Statistics.DataFrame
    use Libraries.Compute.Statistics.Tests.CompareCounts

    DataFrame frame
    frame:Load("data.csv")
    frame:AddSelectedColumns(0)

    CompareCounts compare = frame:CompareCounts()
    output compare:GetSummary()
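
The uniform case described above reduces to a short calculation: every category is expected to hold the same share of the observations. The Python sketch below illustrates that arithmetic only; it is not the library's code, and the function name and data are hypothetical.

```python
from collections import Counter


def goodness_of_fit_uniform(values):
    """Return (statistic, degrees_of_freedom) against an equal split across categories."""
    observed = Counter(values)               # frequency of each unique item
    expected = len(values) / len(observed)   # same expected count for every category
    statistic = sum((count - expected) ** 2 / expected for count in observed.values())
    return statistic, len(observed) - 1


# Hypothetical column with 10, 20, and 30 occurrences of three categories.
column = ["a"] * 10 + ["b"] * 20 + ["c"] * 30
stat, df = goodness_of_fit_uniform(column)
print(stat, df)
```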

GoodnessOfFitAgainstExpectedCounts(Libraries.Compute.Statistics.DataFrame frame, Libraries.Compute.Statistics.DataFrame expected)

This action represents a goodness of fit chi-squared test on a single column of data. It calculates the observed values by counting the frequencies of unique items. Then it compares the observed with the user-supplied expected counts. H0: The population fits the given distribution. Ha: The population does not fit the given distribution.

Parameters

  • Libraries.Compute.Statistics.DataFrame frame
  • Libraries.Compute.Statistics.DataFrame expected

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Columns.TextColumn
use Libraries.Compute.Statistics.Columns.NumberColumn
use Libraries.Compute.Statistics.Tests.CompareCounts

DataFrame frame
frame:Load("Data/Data.csv")
frame:AddSelectedColumns("smoker")

//one row per category: its label and its expected count
TextColumn category
category:Add("yes")
category:Add("no")

NumberColumn count
count:Add(60)
count:Add(50)

DataFrame expected
expected:AddColumn(category)
expected:AddColumn(count)

CompareCounts compare
compare:GoodnessOfFitAgainstExpectedCounts(frame, expected)
output compare:GetSummary()
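
When the expected counts are supplied, the statistic compares them with the observed counts category by category. The Python sketch below illustrates that comparison, reusing the expected counts of 60 and 50 from the example above; the observed counts and the function name are hypothetical, and this is not the library's implementation.

```python
def goodness_of_fit(observed, expected):
    """Return the chi-squared statistic for observed versus expected counts per category."""
    if set(observed) != set(expected):
        raise ValueError("every category needs both an observed and an expected count")
    return sum((observed[k] - expected[k]) ** 2 / expected[k] for k in expected)


expected = {"yes": 60, "no": 50}   # expected counts from the example above
observed = {"yes": 70, "no": 40}   # hypothetical observed counts
stat = goodness_of_fit(observed, expected)
print(stat)
```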

GoodnessOfFitAgainstExpectedPercents(Libraries.Compute.Statistics.DataFrame frame, Libraries.Compute.Statistics.DataFrameColumn percents)

This action represents a goodness of fit chi-squared test on one or more columns of data. For each column, it calculates the observed values by counting the frequencies of unique items. Then it compares the observed with the user-supplied expected percentages. The percentages must add up to 1.0, and there must be a percent for each category. H0: The population fits the given distribution. Ha: The population does not fit the given distribution.

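Parameters

  • Libraries.Compute.Statistics.DataFrame frame
  • Libraries.Compute.Statistics.DataFrameColumn percents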

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Columns.NumberColumn
use Libraries.Compute.Statistics.Tests.CompareCounts

DataFrame frame
frame:Load("Data/Data.csv")
frame:AddSelectedColumns("smoker")

//one expected percent per category; the values must add up to 1.0
NumberColumn percent
percent:Add(0.4)
percent:Add(0.6)

CompareCounts compare
compare:GoodnessOfFitAgainstExpectedPercents(frame, percent)
output compare:GetSummary()

GoodnessOfFitAgainstExpectedPercents(Libraries.Compute.Statistics.DataFrame frame, Libraries.Compute.Statistics.DataFrame percents)

This action represents a goodness of fit chi-squared test on one or more columns of data. For each column, it calculates the observed values by counting the frequencies of unique items. Then it compares the observed with the user-supplied expected percentages. The percentages must add up to 1.0, and there must be a percent for each category. H0: The population fits the given distribution. Ha: The population does not fit the given distribution.

Parameters

  • Libraries.Compute.Statistics.DataFrame frame
  • Libraries.Compute.Statistics.DataFrame percents

Example

use Libraries.Compute.Statistics.DataFrame
use Libraries.Compute.Statistics.Columns.TextColumn
use Libraries.Compute.Statistics.Columns.NumberColumn
use Libraries.Compute.Statistics.Tests.CompareCounts

DataFrame frame
frame:Load("Data/Data.csv")
frame:AddSelectedColumns("smoker")

//one row per category: its label and its expected percent
TextColumn category
category:Add("yes")
category:Add("no")

NumberColumn percent
percent:Add(0.4)
percent:Add(0.6)

DataFrame expected
expected:AddColumn(category)
expected:AddColumn(percent)

CompareCounts compare
compare:GoodnessOfFitAgainstExpectedPercents(frame, expected)
output compare:GetSummary()
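
The requirement that the percentages add up to 1.0 follows from how expected percentages are turned into expected counts: each percentage is multiplied by the number of observations. The Python sketch below illustrates that conversion under the assumption that this is the standard approach; the function name and the total of 200 observations are hypothetical.

```python
import math


def expected_counts_from_percents(percents, total):
    """Scale expected proportions (summing to 1.0) to expected counts for `total` observations."""
    if not math.isclose(sum(percents.values()), 1.0):
        raise ValueError("expected percentages must add up to 1.0")
    return {category: share * total for category, share in percents.items()}


percents = {"yes": 0.4, "no": 0.6}   # same proportions as the example above
expected = expected_counts_from_percents(percents, 200)
print(expected)
```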

IsEmptyColumns()

This action returns a boolean value, true if the container is empty and false if it contains any items.

Return

boolean: Returns true when the container is empty and false when it is not.

IsEmptyFactors()

This action returns a boolean value, true if the container is empty and false if it contains any items.

Return

boolean: Returns true when the container is empty and false when it is not.

Paired()

Returns true if the samples are treated as paired. Used in 2-sample tests.

Return

boolean:

Paired(boolean paired)

Sets whether the samples are treated as paired. Used in 2-sample tests.

Parameters

  • boolean paired

RemoveColumn(integer column)

This action removes the first occurrence of an item that is found in the Addable object.

Parameters

  • integer column

Return

boolean: Returns true if the item was removed and false if it was not removed.

RemoveColumnAt(integer index)

This action removes an item from an indexed object and returns that item.

Parameters

  • integer index

RemoveFactor(integer column)

This action removes the first occurrence of an item that is found in the Addable object.

Parameters

  • integer column

Return

boolean: Returns true if the item was removed and false if it was not removed.

RemoveFactorAt(integer index)

This action removes an item from an indexed object and returns that item.

Parameters

  • integer index

RepeatedMeasures()

Returns true if the test is run as a repeated measures test. Used in N-sample tests.

Return

boolean:

RepeatedMeasures(boolean repeatedMeasures)

Sets whether the test is run as a repeated measures test. Used in N-sample tests.

Parameters

  • boolean repeatedMeasures

SetExperimentalDesign(Libraries.Compute.Statistics.Tests.ExperimentalDesign design)

SetSignificanceLevel(number significanceLevel)

Sets the significance level of the test (default is 0.05).

Parameters

  • number significanceLevel: the significance level between 0 and 1.

SetStatisticalFormatting(Libraries.Compute.Statistics.Reporting.StatisticsFormatting formatting)

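This action sets the formatting used when reporting statistical results.

Parameters

  • Libraries.Compute.Statistics.Reporting.StatisticsFormatting formatting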

TestPairwise(boolean test)

Sets whether pairwise tests are run. Used in N-sample tests.

Parameters

  • boolean test

TestPairwise()

Turns on pairwise tests. Used in N-sample tests.

UseFittedApproach(boolean useFittedApproach)

Chooses the fitted (unplanned) approach to pairwise comparisons for N-sample pairwise tests.

Parameters

  • boolean useFittedApproach

UseLenientCorrection(boolean useLenientCorrection)

Chooses the lenient multiple comparison correction for N-sample pairwise tests.

Parameters

  • boolean useLenientCorrection

UseStrictCorrection(boolean useStrictCorrection)

Chooses the strict pairwise comparison correction for N-sample pairwise tests.

Parameters

  • boolean useStrictCorrection
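
This page does not name the methods behind the strict and lenient corrections. To show what a family-wise correction does in general, the Python sketch below implements two standard ones, Bonferroni and Holm. They are chosen purely as illustrations and are not claimed to be the methods the library applies; the function names and p-values are hypothetical.

```python
def bonferroni(p_values):
    """Multiply every p-value by the number of comparisons, capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Holm step-down adjustment: less conservative than Bonferroni."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices from smallest p upward
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, index in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[index])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[index] = running_max
    return adjusted


# Hypothetical p-values from three pairwise comparisons.
pairwise_p = [0.01, 0.04, 0.03]
print(bonferroni(pairwise_p))
print(holm(pairwise_p))
```

Each adjusted p-value is then compared with the significance level in place of the raw p-value.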

UseUnfittedApproach(boolean useUnfittedApproach)

Chooses the unfitted (planned) approach to pairwise comparisons for N-sample pairwise tests.

Parameters

  • boolean useUnfittedApproach

UsingFittedApproach()

Returns true if multiple comparisons use the model as a reference (the fitted approach) in N-sample pairwise tests.

Return

boolean:

UsingLenientCorrection()

Returns true if the lenient multiple comparison correction is used for N-sample pairwise tests.

Return

boolean:

UsingStrictCorrection()

Returns true if the strict pairwise comparison correction is used for N-sample pairwise tests.

Return

boolean:

UsingUnfittedApproach()

Returns true if multiple comparisons use individual tests (the unfitted approach) in N-sample pairwise tests.

Return

boolean: