Possible issue with promax normalization? #92

Xunius · 2021-10-19T08:33:34Z

Describe the bug

Thanks for creating the tool.
Firstly, I'm still learning the relevant maths so I could be wrong.

I guess my question regarding the normalization of promax rotation is 2-fold (I'll try my best to make it clear):

the normalization by the diag_inv matrix (if I understood it correctly, this maintains the unit-length of rotated factors, and works in each column).
the "kaiser" normalization, which does the length-normalization in each row.

It is mentioned that these rotations are largely adopted from an R package. As I do not know any R so I can't really cross-check.
I've gone through this paper:

Hendrickson, A.E. and White, P.O. (1964), PROMAX: A QUICK METHOD FOR ROTATION TO OBLIQUE SIMPLE STRUCTURE. British Journal of Statistical Psychology, 17: 65-70. https://doi.org/10.1111/j.2044-8317.1964.tb00244.x

Which states that:

The columns of L are normalized such that their sums of squares are equal to unity.

This is supposed to be the 1st normalization I mentioned above.

I also found relevant descriptions in this paper:

Richman, Michael B., 1986: Rotation of principal components. Journal of Climatology, 3, 293-335

where this normalization is achieved using this equation:

So this D1 matrix corresponds to the diag_inv variable in your code.

My concern is that diag_inv calculated in the current manner does not give unit length:

    try:
        diag_inv = np.diag(np.linalg.inv(np.dot(coef.T, coef)))
    except np.linalg.LinAlgError:
        diag_inv = np.diag(np.linalg.pinv(np.dot(coef.T, coef)))

    # transform and calculate inner products
    coef = np.dot(coef, np.diag(np.sqrt(diag_inv)))

Instead, I feel that it should be calculated as:

    diag_inv = np.diag(np.dot(coef.T, coef))
    diag_inv = np.diag(1 / np.sqrt(diag_inv))

I gave a MWE in the below section demonstrating this.

Regarding the 2nd Kaiser normalization, the Hendrickson 1964 paper only mentioned:

Each element of this matrix is, except for sign which remains unchanged, the kth power of the corresponding element in the row-column normalized orthogonal matrix.

without specifying how it is "row-column normalized".

According to the Richman 1986 paper, this row-normalization is done after obtaining the varimax rotated results, in contrary to the implementation in this tool which is normalized before doing varimax. In the same paper the columns are also further normalized by the maximum absolute value, before raising to the k-th power.

I didn't managed to find the exact reference of the R code so I can't comment on that.

In summary, could you help:

check out the 1st normalization and see if there is indeed an issue, and
guide me to some references where the Kaiser normalization is described in more details, possibly the R code as well (I'll try to guess the logic)?

To Reproduce

The below script illustrates the effect of the 1st normalization (by diag_inv).

A few points:

_varimax() and promax2() are adopated from this tool.
promax() is my own implementation, following the Richman 1986 paper.
I've make a couple of changes to your promax() function, labelled by ## CHANGE at the end of the line.

Click to expand

import numpy as np

def _varimax(loadings, normalize=True, max_iter=200, tol=1e-6):
    """
    Perform varimax (orthogonal) rotation, with optional
    Kaiser normalization.

    Parameters
    ----------
    loadings : array-like
        The loading matrix

    Returns
    -------
    loadings : numpy array, shape (n_features, n_factors)
        The loadings matrix
    rotation_mtx : numpy array, shape (n_factors, n_factors)
        The rotation matrix
    """
    X = loadings.copy()
    n_rows, n_cols = X.shape
    if n_cols < 2:
        return X

    # normalize the loadings matrix
    # using sqrt of the sum of squares (Kaiser)
    if normalize:
        normalized_mtx = np.apply_along_axis(lambda x: np.sqrt(np.sum(x**2)), 1, X.copy())
        X = (X.T / normalized_mtx).T

    # initialize the rotation matrix
    # to N x N identity matrix
    rotation_mtx = np.eye(n_cols)

    d = 0
    for _ in range(max_iter):

        old_d = d

        # take inner product of loading matrix
        # and rotation matrix
        basis = np.dot(X, rotation_mtx)

        # transform data for singular value decomposition
        transformed = np.dot(X.T, basis**3 - (1.0 / n_rows) *
                             np.dot(basis, np.diag(np.diag(np.dot(basis.T, basis)))))

        # perform SVD on
        # the transformed matrix
        U, S, V = np.linalg.svd(transformed)

        # take inner product of U and V, and sum of S
        rotation_mtx = np.dot(U, V)
        d = np.sum(S)

        # check convergence
        if old_d != 0 and d / old_d < 1 + tol:
            break

    # take inner product of loading matrix
    # and rotation matrix
    X = np.dot(X, rotation_mtx)

    # de-normalize the data
    if normalize:
        X = X.T * normalized_mtx
    else:
        X = X.T

    # convert loadings matrix to data frame
    loadings = X.T.copy()

    return loadings, rotation_mtx

def promax(loadings, k=2, normalize=True):

    n, r = loadings.shape

    # do varimax
    vmax, rot_mtx = _varimax(loadings, normalize=False)

    # normalize rows
    G = vmax / (1e-6 + np.sqrt(np.sum(vmax**2, axis=1, keepdims=True)))

    # normalize columns
    A = G / np.nanmax(abs(G), axis=0)

    # get B
    B = np.sign(A) * abs(A)**k

    # get T1
    T1 = np.linalg.inv(np.dot(vmax.T, vmax))
    T1 = np.dot(np.dot(T1, vmax.T), B)

    # get D1
    D1 = np.diag(np.dot(T1.T, T1))
    D1 = np.diag(1/np.sqrt(D1))

    # get T
    T = np.dot(T1, D1)

    # rotate
    res = np.dot(loadings, T)

    return res, T

def promax2(loadings, k=2, normalize=True):
    """
    Perform promax (oblique) rotation, with optional
    Kaiser normalization.

    Parameters
    ----------
    loadings : array-like
        The loading matrix

    Returns
    -------
    loadings : numpy array, shape (n_features, n_factors)
        The loadings matrix
    rotation_mtx : numpy array, shape (n_factors, n_factors)
        The rotation matrix
    psi : numpy array or None, shape (n_factors, n_factors)
        The factor correlations
        matrix. This only exists
        if the rotation is oblique.
    """

    X = loadings.copy()
    n_rows, n_cols = X.shape
    if n_cols < 2:
        return X

    if normalize:
        # pre-normalization is done in R's
        # `kaiser()` function when rotate='Promax'.
        array = X.copy()
        h2 = np.diag(np.dot(array, array.T))
        h2 = np.reshape(h2, (h2.shape[0], 1))
        weights = array / (1e-6+np.sqrt(h2))  ## CHANGE

    else:
        weights = X.copy()

    # first get vawhateverrimax rotation
    X, rotation_mtx = _varimax(weights, normalize=False)  ## CHANGE
    Y = X * np.abs(X)**(k - 1)

    # fit linear regression model
    coef = np.dot(np.linalg.inv(np.dot(X.T, X)), np.dot(X.T, Y))

    # calculate diagonal of inverse square
    try:
        diag_inv = np.diag(np.linalg.inv(np.dot(coef.T, coef)))
    except np.linalg.LinAlgError:
        diag_inv = np.diag(np.linalg.pinv(np.dot(coef.T, coef)))

    # transform and calculate inner products
    coef = np.dot(coef, np.diag(np.sqrt(diag_inv)))
    z = np.dot(X, coef)

    if normalize:
        # post-normalization is done in R's
        # `kaiser()` function when rotate='Promax'
        z = z * np.sqrt(h2)

    rotation_mtx = np.dot(rotation_mtx, coef)

    coef_inv = np.linalg.inv(coef)
    phi = np.dot(coef_inv, coef_inv.T)

    # convert loadings matrix to data frame
    loadings = z.copy()
    return loadings, rotation_mtx, phi

if __name__ == '__main__':

    # create some dummy data
    N = 20  # num of observations
    M = 9   # num of features
    r = 3   # return this number of factors

    np.random.seed(10)
    data = np.random.rand(N, M)
    S = np.dot(data.T, data)

    # get an orthogonal matrix eigs, with vectors in columns
    eigvalues, eigs = np.linalg.eigh(S)

    print('eigs.shape', eigs.shape)
    print('eig values', eigvalues)
    print('check length:', np.linalg.norm(eigs[:, :r],axis=0))
    print('check orthogonal:')
    print(np.dot(eigs[:, :r].T, eigs[:, :r]))

    promax_factors1, rot1 = promax(eigs[:, :r])
    promax_factors2, rot2, _ = promax2(eigs[:, :r])

    print('lengths of promax_factors1:', np.linalg.norm(promax_factors1, axis=0))
    print('lengths of promax_factors2:', np.linalg.norm(promax_factors2, axis=0))
    print('diag of rot1:')
    print(np.diag(np.dot(rot1.T, rot1)))
    print('diag of rot2:')
    print(np.diag(np.dot(rot2.T, rot2)))

Screenshots

Output from the above script:

The text was updated successfully, but these errors were encountered:

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Possible issue with promax normalization? #92

Possible issue with promax normalization? #92

Xunius commented Oct 19, 2021 •

edited

Loading

Possible issue with promax normalization? #92

Possible issue with promax normalization? #92

Comments

Xunius commented Oct 19, 2021 • edited Loading

Xunius commented Oct 19, 2021 •

edited

Loading