# Unicode languages

## About this network

This bipartite network denotes which languages are spoken in which countries. Nodes are countries and languages; edge weights denote the proportion (between zero and one) of the population of a given country speaking a given language. To quote the Unicode data description: "The main goal is to provide approximate figures for the literate, functional population for each language in each territory: that is, the population that is able to read and write each language, and is comfortable enough to use it with computers."
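As a minimal sketch of the structure described above, the network can be held as a list of (country, language, proportion) triples. The node names and weights below are placeholders chosen for illustration, not values from the dataset, and the `degrees` helper is hypothetical.

```python
from collections import defaultdict

# Placeholder edges (country, language, proportion); NOT taken from the dataset.
edges = [
    ("country_A", "lang_x", 0.90),
    ("country_A", "lang_y", 0.25),
    ("country_B", "lang_x", 0.60),
]

def degrees(edges):
    """Count, for each node on either side, how many edges touch it."""
    country_deg = defaultdict(int)
    language_deg = defaultdict(int)
    for country, language, weight in edges:
        # Weights are proportions of a country's population, so they lie in [0, 1].
        if not 0.0 <= weight <= 1.0:
            raise ValueError(f"weight out of range: {weight}")
        country_deg[country] += 1
        language_deg[language] += 1
    return dict(country_deg), dict(language_deg)

country_deg, language_deg = degrees(edges)
print(country_deg)   # {'country_A': 2, 'country_B': 1}
print(language_deg)  # {'lang_x': 2, 'lang_y': 1}
```

A country's degree is the number of languages listed for it, and a language's degree is the number of countries in which it is listed.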

## Network info

Code | UL |

Category | Feature |

Date of origin | 2015 |

Data source | http://www.unicode.org/cldr/charts/25/supplemental/territory_language_information.html |

Vertex type | Country, language |

Edge type | Hosts |

Format | Bipartite |

Edge weights | Positive weights |

Metadata | Zero weight Entity |

Size | 868 = 254 + 614 vertices (countries + languages) |

Volume | 1,255 edges (hosts) |

Average degree (overall) | 2.8917 edges / vertex |

Average country degree | 4.9409 edges / vertex |

Average language degree | 2.0440 edges / vertex |

Fill | 0.0070917 edges / vertex^{2} |

Maximum degree | 141 edges |

Wedge count | 21,977 |

Claw count | 521,909 |

Square count | 1,266 |

4-tour count | 86,712 |

Power law exponent (estimated) with d_{min} | 2.3710 (d_{min} = 3) |

Gini coefficient | 58.3% |

Relative edge distribution entropy | 88.9% |

Assortativity | –0.25144 |

Diameter | 8 edges |

90-percentile effective diameter | 5.24 edges |

Mean shortest path length | 4.08 edges |

Spectral norm | 7.9343 |

Algebraic connectivity | 0.00026391 |
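The degree statistics can be cross-checked against the edge count. The sketch below recovers the number of countries and languages from the two per-side average degrees, then recomputes the overall average degree. It only uses values listed in the table; the variable names are illustrative.

```python
# Values copied from the table above.
edges = 1255
avg_country_degree = 4.9409
avg_language_degree = 2.0440

# On each side of a bipartite graph: average degree = edges / vertices on that side.
countries = round(edges / avg_country_degree)
languages = round(edges / avg_language_degree)
vertices = countries + languages

# Every edge contributes to the degree of exactly two vertices.
avg_degree = 2 * edges / vertices

print(countries, languages, vertices)  # 254 614 868
print(round(avg_degree, 4))            # 2.8917
```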

## Downloads

TSV file: | unicodelang.tar.bz2 (7.77 KiB) |

Extraction code: | unicodelang.tar.bz2 (25.50 KiB) |
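A possible way to read the edge list once the archive is unpacked is sketched below. The file layout is an assumption, not something stated on this page: one edge per line as whitespace-separated `country language weight` with integer node IDs, and comment lines starting with `%`. The `parse_edges` helper is hypothetical.

```python
def parse_edges(lines):
    """Parse 'country language weight' rows, skipping blank and '%' comment lines.

    Assumed layout (not confirmed by this page): integer node IDs followed by
    a floating-point weight, separated by whitespace.
    """
    edges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("%"):
            continue
        left, right, weight = line.split()[:3]
        edges.append((int(left), int(right), float(weight)))
    return edges

# Hypothetical sample in the assumed layout; not real rows from the dataset.
sample = ["% comment", "1 1 0.5", "1 2 0.25", "", "2 1 1.0"]
print(parse_edges(sample))  # [(1, 1, 0.5), (1, 2, 0.25), (2, 1, 1.0)]
```

The archive itself can be opened with Python's standard `tarfile.open(path, "r:bz2")`. The names of the files inside it should be checked after extraction rather than assumed.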