An answer from the Vietnamese Wikipedia community regarding machine translation

(This is a message from ThiênĐế98)

@Pginer-WMF: Hi Pginer, I'm so sorry I can't answer you in English myself because my writing skills are not good enough. Miss Bluetpp will help me translate my message into English and send it to you.

First of all, I want to correct the statistics that you cited in your message. Contrary to the number you've got, the number of deleted translations is very high, possibly more than 50%, in the writing contest that we've been holding. The reason is that we can't monitor the contest closely enough, as we don't have enough manpower, the eternal problem of our community. We are now in the process of deleting poor-quality translations with the aim of "purifying" our project (we are going to do it for real next week).

To answer your first question: As an examiner of the writing contests in the past few months, I identify "fake" translations by meaningless words, excessive prepositions carried over from English, odd terminology and unfamiliar grammatical structures throughout the translation. Most of the initial translations come from the writing contests, as the users are all after quantity and don't have the professional knowledge required; sometimes they each publish more than 10 articles a day, and each contest month we have more than 200 articles published. Most of them are of bad quality, with serious translation problems. I estimate that about 3,000-4,000 such articles have "accumulated" over the contest months from June to now.

To answer your second question: Our board is now warning users who publish fake translations; after 3 warnings we are going to lock their accounts for a week. All these regulations have been agreed on by our community to prevent this raging problem. As for the translations themselves, we put the "Articles for deletion" sign on the articles and require the user to edit them; if they don't do so within a week, the articles will be moved to the Sandbox for 3 days and deleted after that, in accordance with what has been agreed on by our community. I think that warning the authors is very important, and we'd go as far as preventing them from publishing unmodified translations in order to stop poor-quality translations.

After talking with user Bluetpp, I have postponed the talk about limits with the community because there is still so much that we don't understand. I hope we can continue this plan in 2020 with the aid of other sysops (since I have personal plans at the end of this year).

Thank you for caring about the translation problem occurring in our community; it seriously causes us a lot of trouble. Personally, the passionate user and I are trying to thoroughly address the machine translation problem and mitigate the consequences of the writing contests, maybe this year or next year. I hope that the measures you bring to us can help "cure" our project and restore our credibility with Vietnamese readers.

This is my personal opinion, from ThiênĐế98, a member of the board.


Hi, I'm Bluetpp, a Vietnamese Wikipedian as well as an Ambassador for the Growth team's Newcomer Project. We just want to ask you about the limits: we want to apply them, but we are currently not sure what to do or how to do it. Can you walk us through it and tell us what we should discuss first with the community? Tiểu Phương #Talk2me 14:49, 17 November 2019 (UTC)

Hi @Bluetpp: Currently, the limits can be adjusted in the tool configuration files. This is something the Language team can adjust if there is a specific request. I'd recommend making several test translations with Content Translation in order to identify which limit would be ideal, as described here. You can then request the adjustment from the Language team on the talk page of the project. Please let me know if you have any other questions about the process. Thanks!