I guess that all those modifications to get AI to not say anything compromising or do any un-PC observations have fucked it up. Even since we crossed the 60 million parameter level AIs have become a sort of black box where data goes in and result goes out but we're not sure of what the entire process to get that result was, it "just works" most of the time, how the model transforms inputs into outputs—remain opaque. Convolutional layers make it harder for humans to understand what features the model is using for its predictions. Current models are on the trillions of parameters.
So for example sometimes when seeing code I've found comments saying "IDK what this does but when I delete it the entire program stops working". IMHO AI is a lot like this in that if you change something it changes other things. So by keeping the AI giving results that are "not convenient" they're messing up its entire reasoning, it can't say there are more than 2 genders without that fucking up its logic in other areas too.
That's my theory.