What is the landscape of natural language? Insights from a random language model
Many complex systems have a generative, or linguistic, aspect: life is written in the language of DNA; protein structure is written in a language of amino acids, and human endeavour is often written in text. Are there universal aspects of the relationship between sequence and structure? I am trying to answer this question using models of random languages. Recently I proposed a model of random context-free languages [1] and showed using simulations that the model has a transition from an unintelligent phase to an ordered phase. In the former, produced sequences look like noise, while in the latter they have a nontrivial Shannon entropy; thus the transition corresponds to the emergence of information-carrying in the language.
In this talk I will explain the basics of natural language syntax, without assuming any prior knowledge of linguistics. I will present the results from the model above, and explain how the model is related to complex matrix models with disorder [2].
[1] https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.122.128301
[2] https://arxiv.org/abs/1902.07516