Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible issue with promax normalization? #92

Open
Xunius opened this issue Oct 19, 2021 · 0 comments
Open

Possible issue with promax normalization? #92

Xunius opened this issue Oct 19, 2021 · 0 comments

Comments

@Xunius
Copy link

Xunius commented Oct 19, 2021

Describe the bug

Thanks for creating the tool.
Firstly, I'm still learning the relevant maths so I could be wrong.

I guess my question regarding the normalization of promax rotation is 2-fold (I'll try my best to make it clear):

  1. the normalization by the diag_inv matrix (if I understood it correctly, this maintains the unit-length of rotated factors, and works in each column).
  2. the "kaiser" normalization, which does the length-normalization in each row.

It is mentioned that these rotations are largely adopted from an R package. As I do not know any R so I can't really cross-check.
I've gone through this paper:

Hendrickson, A.E. and White, P.O. (1964), PROMAX: A QUICK METHOD FOR ROTATION TO OBLIQUE SIMPLE STRUCTURE. British Journal of Statistical Psychology, 17: 65-70. https://doi.org/10.1111/j.2044-8317.1964.tb00244.x

Which states that:

The columns of L are normalized such that their sums of squares are equal to unity.

This is supposed to be the 1st normalization I mentioned above.

I also found relevant descriptions in this paper:

Richman, Michael B., 1986: Rotation of principal components. Journal of Climatology, 3, 293-335

where this normalization is achieved using this equation:

image

So this D1 matrix corresponds to the diag_inv variable in your code.

My concern is that diag_inv calculated in the current manner does not give unit length:

    try:
        diag_inv = np.diag(np.linalg.inv(np.dot(coef.T, coef)))
    except np.linalg.LinAlgError:
        diag_inv = np.diag(np.linalg.pinv(np.dot(coef.T, coef)))

    # transform and calculate inner products
    coef = np.dot(coef, np.diag(np.sqrt(diag_inv)))

Instead, I feel that it should be calculated as:

    diag_inv = np.diag(np.dot(coef.T, coef))
    diag_inv = np.diag(1 / np.sqrt(diag_inv))

I gave a MWE in the below section demonstrating this.

Regarding the 2nd Kaiser normalization, the Hendrickson 1964 paper only mentioned:

Each element of this matrix is, except for sign which remains unchanged, the kth power of the corresponding element in the row-column normalized orthogonal matrix.

without specifying how it is "row-column normalized".

According to the Richman 1986 paper, this row-normalization is done after obtaining the varimax rotated results, in contrary to the implementation in this tool which is normalized before doing varimax. In the same paper the columns are also further normalized by the maximum absolute value, before raising to the k-th power.

I didn't managed to find the exact reference of the R code so I can't comment on that.

In summary, could you help:

  1. check out the 1st normalization and see if there is indeed an issue, and
  2. guide me to some references where the Kaiser normalization is described in more details, possibly the R code as well (I'll try to guess the logic)?

To Reproduce

The below script illustrates the effect of the 1st normalization (by diag_inv).

A few points:

  • _varimax() and promax2() are adopated from this tool.
  • promax() is my own implementation, following the Richman 1986 paper.
  • I've make a couple of changes to your promax() function, labelled by ## CHANGE at the end of the line.
Click to expand
import numpy as np

def _varimax(loadings, normalize=True, max_iter=200, tol=1e-6):
    """
    Perform varimax (orthogonal) rotation, with optional
    Kaiser normalization.

    Parameters
    ----------
    loadings : array-like
        The loading matrix

    Returns
    -------
    loadings : numpy array, shape (n_features, n_factors)
        The loadings matrix
    rotation_mtx : numpy array, shape (n_factors, n_factors)
        The rotation matrix
    """
    X = loadings.copy()
    n_rows, n_cols = X.shape
    if n_cols < 2:
        return X

    # normalize the loadings matrix
    # using sqrt of the sum of squares (Kaiser)
    if normalize:
        normalized_mtx = np.apply_along_axis(lambda x: np.sqrt(np.sum(x**2)), 1, X.copy())
        X = (X.T / normalized_mtx).T

    # initialize the rotation matrix
    # to N x N identity matrix
    rotation_mtx = np.eye(n_cols)

    d = 0
    for _ in range(max_iter):

        old_d = d

        # take inner product of loading matrix
        # and rotation matrix
        basis = np.dot(X, rotation_mtx)

        # transform data for singular value decomposition
        transformed = np.dot(X.T, basis**3 - (1.0 / n_rows) *
                             np.dot(basis, np.diag(np.diag(np.dot(basis.T, basis)))))

        # perform SVD on
        # the transformed matrix
        U, S, V = np.linalg.svd(transformed)

        # take inner product of U and V, and sum of S
        rotation_mtx = np.dot(U, V)
        d = np.sum(S)

        # check convergence
        if old_d != 0 and d / old_d < 1 + tol:
            break

    # take inner product of loading matrix
    # and rotation matrix
    X = np.dot(X, rotation_mtx)

    # de-normalize the data
    if normalize:
        X = X.T * normalized_mtx
    else:
        X = X.T

    # convert loadings matrix to data frame
    loadings = X.T.copy()

    return loadings, rotation_mtx

def promax(loadings, k=2, normalize=True):

    n, r = loadings.shape

    # do varimax
    vmax, rot_mtx = _varimax(loadings, normalize=False)

    # normalize rows
    G = vmax / (1e-6 + np.sqrt(np.sum(vmax**2, axis=1, keepdims=True)))

    # normalize columns
    A = G / np.nanmax(abs(G), axis=0)

    # get B
    B = np.sign(A) * abs(A)**k

    # get T1
    T1 = np.linalg.inv(np.dot(vmax.T, vmax))
    T1 = np.dot(np.dot(T1, vmax.T), B)

    # get D1
    D1 = np.diag(np.dot(T1.T, T1))
    D1 = np.diag(1/np.sqrt(D1))

    # get T
    T = np.dot(T1, D1)

    # rotate
    res = np.dot(loadings, T)

    return res, T

def promax2(loadings, k=2, normalize=True):
    """
    Perform promax (oblique) rotation, with optional
    Kaiser normalization.

    Parameters
    ----------
    loadings : array-like
        The loading matrix

    Returns
    -------
    loadings : numpy array, shape (n_features, n_factors)
        The loadings matrix
    rotation_mtx : numpy array, shape (n_factors, n_factors)
        The rotation matrix
    psi : numpy array or None, shape (n_factors, n_factors)
        The factor correlations
        matrix. This only exists
        if the rotation is oblique.
    """

    X = loadings.copy()
    n_rows, n_cols = X.shape
    if n_cols < 2:
        return X

    if normalize:
        # pre-normalization is done in R's
        # `kaiser()` function when rotate='Promax'.
        array = X.copy()
        h2 = np.diag(np.dot(array, array.T))
        h2 = np.reshape(h2, (h2.shape[0], 1))
        weights = array / (1e-6+np.sqrt(h2))  ## CHANGE

    else:
        weights = X.copy()

    # first get vawhateverrimax rotation
    X, rotation_mtx = _varimax(weights, normalize=False)  ## CHANGE
    Y = X * np.abs(X)**(k - 1)

    # fit linear regression model
    coef = np.dot(np.linalg.inv(np.dot(X.T, X)), np.dot(X.T, Y))

    # calculate diagonal of inverse square
    try:
        diag_inv = np.diag(np.linalg.inv(np.dot(coef.T, coef)))
    except np.linalg.LinAlgError:
        diag_inv = np.diag(np.linalg.pinv(np.dot(coef.T, coef)))

    # transform and calculate inner products
    coef = np.dot(coef, np.diag(np.sqrt(diag_inv)))
    z = np.dot(X, coef)

    if normalize:
        # post-normalization is done in R's
        # `kaiser()` function when rotate='Promax'
        z = z * np.sqrt(h2)

    rotation_mtx = np.dot(rotation_mtx, coef)

    coef_inv = np.linalg.inv(coef)
    phi = np.dot(coef_inv, coef_inv.T)

    # convert loadings matrix to data frame
    loadings = z.copy()
    return loadings, rotation_mtx, phi

if __name__ == '__main__':

    # create some dummy data
    N = 20  # num of observations
    M = 9   # num of features
    r = 3   # return this number of factors

    np.random.seed(10)
    data = np.random.rand(N, M)
    S = np.dot(data.T, data)

    # get an orthogonal matrix eigs, with vectors in columns
    eigvalues, eigs = np.linalg.eigh(S)

    print('eigs.shape', eigs.shape)
    print('eig values', eigvalues)
    print('check length:', np.linalg.norm(eigs[:, :r],axis=0))
    print('check orthogonal:')
    print(np.dot(eigs[:, :r].T, eigs[:, :r]))

    promax_factors1, rot1 = promax(eigs[:, :r])
    promax_factors2, rot2, _ = promax2(eigs[:, :r])

    print('lengths of promax_factors1:', np.linalg.norm(promax_factors1, axis=0))
    print('lengths of promax_factors2:', np.linalg.norm(promax_factors2, axis=0))
    print('diag of rot1:')
    print(np.diag(np.dot(rot1.T, rot1)))
    print('diag of rot2:')
    print(np.diag(np.dot(rot2.T, rot2)))

Screenshots

Output from the above script:

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant