@alberto what you are talking about is essentially lossy compression. “Coding” would be the “dictionary” that is generated by the compression, so with reference to the literature produced on the topic by Lempel and Ziv, you can expect that:
- you can estimate the relative complexity of two conversations from the sizes of the dictionaries that the same operator generates when coding them both
- you can decide on their noisiness, chaotic turbulence, or stability, based on how quickly each one's dictionary size converges to a finite limit (in the case of pure noise, the limit would be infinite for a machine, and zero in the eyes of a human operator)
- certain algebraic operators can be used to estimate the proximity of different conversations.
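To make the three points a bit more concrete, here is a rough Python sketch, not a definitive method. It assumes each conversation (or its sequence of codes) is available as a plain string, and it uses lossless machine compression as a stand-in for the human coder. The function names are made up for illustration, and the normalized compression distance computed with `zlib` is just one well-known proximity measure I am swapping in; it is not necessarily the operator meant above.

```python
import zlib


def lz78_dictionary_size(text):
    """Count the phrases an LZ78-style parse adds to its dictionary."""
    dictionary = set()
    phrase = ""
    for ch in text:
        phrase += ch
        if phrase not in dictionary:
            # New phrase: store it and start a fresh one.
            dictionary.add(phrase)
            phrase = ""
    # A trailing phrase that is already in the dictionary is not counted.
    return len(dictionary)


def dictionary_growth(text, step=100):
    """Dictionary size after each successive prefix of `step` more characters."""
    return [lz78_dictionary_size(text[:n])
            for n in range(step, len(text) + 1, step)]


def compressed_len(text):
    """Length in bytes of the zlib-compressed UTF-8 encoding of `text`."""
    return len(zlib.compress(text.encode("utf-8")))


def ncd(a, b):
    """Normalized compression distance (assumed proximity measure):
    close to 0 for near-identical inputs, close to 1 for unrelated ones."""
    ca, cb = compressed_len(a), compressed_len(b)
    cab = compressed_len(a + b)
    return (cab - min(ca, cb)) / max(ca, cb)
```

Comparing `lz78_dictionary_size` across two conversations corresponds to the first point, the shape of the list returned by `dictionary_growth` (flattening early versus still climbing) to the second, and `ncd` to the third.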
Please note that since ethnographic coding is lossy, the aforementioned holds only approximately, and it should not be possible to firmly demonstrate any of it (unless I am missing something).
I hope this helps