Broader scholarship in anthropology provides context on how socioeconomic and linguistic hierarchies in the Indian subcontinent have created barriers to access and participation for many marginalized communities. Such studies also help clarify the direct impact of systemic socioeconomic privilege (or the lack of it) on the agency of communities affected by technology. Many new and emerging technologies, such as Distributed Ledger Technology (DLT), are often touted as open and distributed by design. In theory, DLT does have the potential to promote openness and accountability from a technical standpoint. However, user sovereignty, wider access to information, and fair and equitable content monetization need further investigation in relation to socioeconomic dominance. In our research, we have attempted to understand the web content ecosystem in two indigenous Adivasi languages, Ho and Santali, and how the current challenges and opportunities bear on the future of web content monetization using DLT. It is important to note that Ho and Santali speakers have historically been oppressed in India, most notably through the Hindu caste system as it shapes access to wealth, education and livelihoods. The factors we studied include the affordability of smartphones and computers for creating content, access to the internet, know-how about monetizing content, and the ability of audiences to pay for content.
- Building a small research team: Prasanta Hembram (Santali-language content creator active in editing Wikipedia, in localization and user documentation for many open-source software projects, and in creating speech data using Mozilla Common Voice) and Ganesh Birua (Ho-language content creator active in promoting Ho on social media, blogs and other web platforms) joined Subhashish (founder of OpenSpeaks, an open project aimed at building resources and strategies for multimedia content creators in low- and medium-resource languages) to further this study.
- Public and open research hosted on Meta-Wiki, a Wikimedia project licensed under a Creative Commons Attribution-ShareAlike 3.0 License.
- Creation of hypothesis: Development of a broad range of hypotheses based on prior knowledge.
- Desk research on the web content ecosystem keeping in mind socioeconomic, technological, educational and livelihoods factors.
- Interim research findings presented at the Mozilla Festival 2021.
- Surveys and in-depth interviews conducted with two groups of citizen content creators who have created their own web channels (YouTube, blogs, Facebook Pages/groups and Twitter handles). Analysis of these responses helped elucidate the status quo of the web content ecosystems of these two languages and how they contrast with their dominant-language counterparts, and indicated the gaps and opportunities while shedding light on the strategy behind content monetization.
- RightsCon 2022 proposal selected to unveil the research findings.
- Team preparing to submit a book proposal with the final outcomes.
- Design sprint: literature review, development of "how might we..." questions and preparation for Creative Commons Global Summit 2021 presentation
- Presented a session at the Creative Commons Global Summit 2021
- Participation by Prasanta Hembram at a Santali cultural event and documentation of anecdotal insights from content creators
- Organized a panel discussion at the United Nations Internet Governance Forum (IGF) titled "Building the wiki-way for low-resource languages"
- Co-organizing a workshop on media creation in the Ho language by Subhashish P.
- Participation in the MozFest Trustworthy AI Working Groups program and development of openly-licensed (CC0 1.0 licensed) Natural Language Processing data in Ho and Santali.
- Presented at MozFest 2022: "Low-resource languages, and their open source AI/ML solutions through a radical empathy lens"
- Submission titled “Should web content monetization be allowed for Indigenous languages?” accepted for RightsCon 2022.
- Over the course of this study, we have presented at four international conferences that focus respectively on open licenses, openness and digital rights advocacy: Creative Commons Global Summit 2021, Mozilla Festival 2021 and 2022, and RightsCon 2022. We have also participated in and presented at the United Nations Internet Governance Forum, an international conference that brings together governments, civil society actors, citizens and private organizations for high-level, cross-country discourse about an open, distributed, decentralized and multi-stakeholder-led internet. We have organized an in-person workshop, in collaboration with an organization working towards the development of the Ho language, to further study the close connection between societies and the internet, especially as social spaces shrank during the pandemic. Additionally, we have contributed towards building wordlists in both the Ho and Santali languages, in collaboration with the communities, to help future work in Natural Language Processing (NLP), particularly the creation of spell-check and speech data, which is essential for further content development. This work was beyond the original planned objectives of this project and received support from Mozilla under a separate project.
We plan to publish the main report, with the research outcomes and recommendations, on the existing open research page on Meta-Wiki. The report will also include an executive summary in both Ho and Santali so that the respective speakers can access its content with ease. Apart from the interim grant report, we had created an interim research report that we submitted to the ACM FAccT Conference, but it was not selected for presentation. We are developing the report further as a book proposal and will explore publishing it through a publisher.