A COMPARISON OF COMPUTATIONAL LINGUISTICS APPROACHES ACROSS LANGUAGES

Authors

  • Sanjar Norqobilov

DOI:

https://doi.org/10.47390/SPR1342V3I12.2Y2023N28

Keywords:

Machine translation, computational linguistics, natural language processing, cross-lingual analysis.

Abstract

This article presents a comparative study of how core computational linguistics techniques perform across typologically diverse languages. Focusing on machine translation (MT), it analyzes how linguistic diversity complicates computational approaches and argues that MT development requires language-specific adaptation rather than a single general-purpose approach. Through a literature review and a cross-language comparison, it examines challenges such as word order differences, morphological complexity, lexical ambiguity, and insufficient resources. The results identify MT difficulties for Arabic, Chinese, Hindi, and Swahili. The discussion centers on how rule-based, statistical, and neural MT techniques are affected by different linguistic properties, and on the need for morphological analysis and tailored data. These findings underline the importance of inclusivity in computational linguistics, which should move away from its English-centric focus. The study concludes that effective algorithms require adaptation and language-specific customization in order to model the structures of the world's approximately 7,000 languages.

Published

2024-01-06

How to Cite

Norqobilov, S. (2024). TURLI TILLARDAGI KOMPYUTER LINGVISTIKASI YONDASHUVLARINI TAQQOSLASH [A comparison of computational linguistics approaches across languages]. Ижтимоий-гуманитар фанларнинг долзарб муаммолари / Актуальные проблемы социально-гуманитарных наук / Actual Problems of Humanities and Social Sciences, 3(12/2). https://doi.org/10.47390/SPR1342V3I12.2Y2023N28