SSRN Id4685971


AI-chatbots for agriculture - Where can large language models provide substantial value?

Matheus Thomas Kuska a, Mirwaes Wahabzada b and Stefan Paulus* c

a Pfeifer & Langen GmbH & Co. KG, Aachener Str. 1042A, 50858 Köln, Germany
b BASF Digital Farming GmbH, Im Zollhafen 24, 50678 Köln, Germany
c Institute of Sugar Beet Research (IfZ), Holtenser Landstraße 77, 37079 Göttingen, Germany
* corresponding author




Abstract
Since the launch of ChatGPT, based on the “Generative Pre-trained Transformer 3.5”, by OpenAI (OpenAI Inc., San Francisco, USA), artificial intelligence (AI) has been a main topic of public discussion. Large language models (LLMs), so-called "intelligent" chatbots, and the possibility of automatically generating highly professional technical texts receive particular attention. They have the potential to impact the entire market, industries and professions, as well as science and reporting. Companies and researchers are evaluating possible applications and how such a powerful LLM can be integrated into daily work to bring benefits, improve business or make research more efficient.

In general, the underlying models are trained on large datasets, mainly drawn from websites, online books and articles. In combination with information provided by the user, the model can give an impressively fast response. Even if the range of questions and answers looks unrestricted, the models have limits: LLMs are not able to interpret large amounts of numerical data.

Nevertheless, integrating AI assistance into daily routines holds great potential. These models reach remarkable levels of natural-language-like text generation. The authors envisage the establishment of LLMs as digital helpers for text development. This represents a fundamental step towards closing the gap between AI-driven data analysis and its interpretation by LLMs, in terms of semantics and explanation in an understandable way.

This preprint research paper has not been peer reviewed. Electronic copy available at:
In this paper, which is not generated by an AI, possible use cases for agricultural tasks are elucidated. These include the textual preparation of facts, consulting tasks, the interpretation of decision support models in plant disease management, and tutorials for integrating modern digital techniques into agricultural work. Opportunities and challenges are described, as well as limitations and insufficiencies. The authors describe a map of easy-to-reach topics in agriculture where the integration of LLMs seems very likely within the next few years.


Keywords. LLM, ChatGPT, AI-assistance, linguistic editing, prompt interpretation, agriculture 4.0, digital farming


Current global issues in agricultural practice

Agriculture must continue to evolve. The sector is tasked with meeting rigorous environmental protection targets and with being sustainable. It is also crucial to increase its recognition in society, so that the many different job opportunities in the sector become more visible and attractive to future generations. Nowadays, farmers themselves must be all-rounders in plant growth, plant protection, animal feeding, economics, legislation, etc. (Kuska et al., 2022); such an all-rounder is rare. Farm employees usually work as experts on specialised tasks, but the number of employees on farms is shrinking for many reasons (Yoon et al., 2021). This reduces the time available for employee training and for testing new ideas or research findings.

Finally, farmers spend most of their time on daily business as usual to ensure their economic stability. Challenges in today’s traditional, if not rapidly evolving, agriculture must still be overcome, for example in crop growing or plant protection. Adoption of innovations in agriculture is a multifaceted process. One facet is the risk profile of the innovation. The perceived risk of an innovation is related to its complexity and the adopter’s ability to understand it. Farmers and farm employees need to be specially trained for new applications and need a high commitment to the new digital technologies. At this point, LLMs can be a beneficial explainer and trainer for digital agriculture. For example, machine learning methods specific to sensor-based techniques, such as in digital plant pathology, plant phenotyping and remote sensing monitoring, have developed rapidly. As a result, the techniques have moved quickly from basic research to practical application in precision agriculture (summarized in Mahlein et al., 2022). Nevertheless, the outputs are highly mathematical and are not easy to understand and interpret. LLMs are specifically capable of opening easy access to machine learning results, suitably edited even for non-experts. The first experiences with an LLM may be challenging for an inexperienced user, but the large potential will quickly become apparent. In general, the digitalization of agriculture will change farmers' identity, skills, and work (Zolkin et al., 2021).

What are the application fields for such a model and where may its integration first take place?

In public discussions, the currently most recognized field of application of an LLM is explaining the legal basis of farming. This is one of the main current topics, related to new legislation to fight climate change, e.g. the Farm to Fork strategy in Europe, but it also shows a real existing use case of an LLM supporting the farmer. LLMs can effectively explain legal documents, including the application of specific regional regulations, to clarify the legal implementation of a particular task, such as a plant protection measure, in a way that is easy to understand. This is possible because the LLM uses a probabilistic tokenization, which compresses the “datasets” (i.e. text) (Chang et al., 2023). In the case of highly sophisticated LLMs such as GPT-3, data cleaning has been carried out, as well as reinforcement learning from human feedback. This fine-tuning process ensures that the model provides answers that are more easily understandable. As an example, the authors asked GPT-3.5 about the "Plant Protection Application Regulation" in Germany. It is a 23-page regulation under the Plant Protection Act with complex and cross-referenced interrelationships. GPT-3.5 answers specific requests about the regulation in a few bullet points, which are all understandable and in which the cross-referenced interrelationships are already resolved. Unfortunately, laws are often changed, which requires repeated retraining of the LLM so that it can give the correct and legally valid answer. This requires high human effort and a high economic investment for a high-quality model.

However, the authors' literature research uncovers four big use cases for LLMs in agriculture: a) consulting and assistance, b) automated documentation, c) explanation and education, and d) interpretation of ML results and forecasts (Fig. 1).

Consulting and assistance cover the idea of agricultural recommendations. This can help to find the right point in time, the right action and the right tool to increase the chance of high yield in the field. Automated documentation describes the translation of machine-readable data from farm management systems and machine trackers into human-readable text. Explanation and education target the automated generation of handbooks, tutorials, videos and books for self-study. This is highly recommended for digital tools, which often need a less technical and more understandable introduction. Finally, the interpretation of ML results and forecasts gives direct decision support or a context-based interpretation (Fig. 1).
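The automated documentation use case can be illustrated with a minimal sketch, assuming a farm management system exports structured records. The record layout (`FieldOperation` and its field names), the example values and the prompt wording are hypothetical and not taken from any real system; no LLM is called, the code only prepares human-readable facts and a prompt that could be passed to one.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class FieldOperation:
    """One machine-readable record, e.g. exported from a farm management system.

    All field names are hypothetical; real exports will differ.
    """
    field_name: str
    operation: str        # e.g. "fungicide application"
    product: str
    rate_l_per_ha: float
    area_ha: float
    performed_on: date


def to_sentence(op: FieldOperation) -> str:
    """Render one record as a plain-language documentation sentence."""
    total_l = op.rate_l_per_ha * op.area_ha
    return (
        f"On {op.performed_on.isoformat()}, {op.operation} with {op.product} "
        f"was carried out on field '{op.field_name}' ({op.area_ha:.1f} ha) "
        f"at {op.rate_l_per_ha:.2f} l/ha, {total_l:.1f} l in total."
    )


def build_report_prompt(ops: list[FieldOperation]) -> str:
    """Assemble a prompt asking an LLM to turn the records into a short report.

    The prompt wording is an assumption; no LLM is called here.
    """
    facts = "\n".join(f"- {to_sentence(op)}" for op in ops)
    return (
        "Write a short field report for the farm records below. "
        "Use only the listed facts and do not add numbers.\n" + facts
    )


if __name__ == "__main__":
    ops = [FieldOperation("North 3", "fungicide application", "Product A",
                          1.25, 8.0, date(2023, 6, 14))]
    print(build_report_prompt(ops))
```

Computing totals in code and handing the model only finished sentences is a deliberate choice here: it keeps the numerical work away from the LLM, which the paper notes is not suited to interpreting numerical data.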

Current development of LLMs for agricultural value chain

Latest developments show that, for topic experts, LLMs are improved search engines since they are more specific and focused (Arcila, 2023). For agriculture, this implies that LLMs will improve farmer consultation by providing all necessary information, e.g. on crop cultivation, breeding, machines, or phytopathology, to an advisor or to the farmer directly (Fig. 1). The prerequisite for this, however, is properly processed databases that contain validated information. Agricultural consultation requires complex input, as it depends not only on abiotic factors such as measurable weather, but also on soil quality and type and on the specific farm management. The model systems of the future must be even more capable of recording the user's individual information and classifying and characterizing it correctly. For this purpose, a large number of user profiles must be created, which also include further professions within agriculture. Therefore, many start-ups, consultancies and even private persons are already using no-code chatbot development tools (Arawjo et al., 2023) to develop topic-specific chatbots. Such systems bring common text embedding models, like Word2Vec or BERT, and also the basis for retrieval-augmented generation, into an almost “drag-and-drop” system. However, the evaluation and debugging of the LLM is still handcrafted work and needs high human effort. The higher the topic specialization (e.g. instructions for repairing the loading strip in a sugar beet loading mouse), the more experts are needed who know and can implement a precise workflow in detail. This, in a somewhat meta situation, gives the LLM the opportunity to train new experts in this field or to create teaching materials (Fig. 1). However, this topic complexity can only be overcome by using publicly accessible databases, e.g. for consulting.
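The retrieval step behind retrieval-augmented generation can be sketched as follows, under loudly stated assumptions: the knowledge base entries are illustrative placeholders rather than validated agronomic content, and a simple bag-of-words cosine similarity stands in for the learned embeddings (Word2Vec, BERT) mentioned above. The sketch selects the best-matching entry and places it, with its identifier, in a prompt; the call to an actual LLM is left out.

```python
import math
import re
from collections import Counter

# Tiny stand-in for a validated knowledge base; the entries are illustrative
# placeholders, not agronomic advice.
KNOWLEDGE_BASE = {
    "doc-sowing": "Sugar beet sowing time depends on soil temperature and soil moisture.",
    "doc-leafspot": "Cercospora leaf spot risk in sugar beet rises with warm and humid weather.",
    "doc-records": "Plant protection measures must be documented with date, product and field.",
}


def bag_of_words(text: str) -> Counter:
    """Lower-cased word counts; a crude substitute for a learned embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, k: int = 1) -> list[tuple[str, float]]:
    """Return the k knowledge base entries most similar to the question."""
    q = bag_of_words(question)
    scored = [(doc_id, cosine(q, bag_of_words(text))) for doc_id, text in KNOWLEDGE_BASE.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def build_prompt(question: str, k: int = 1) -> str:
    """Put the retrieved entries, with their ids, in front of the question."""
    hits = retrieve(question, k)
    context = "\n".join(f"[{doc_id}] {KNOWLEDGE_BASE[doc_id]}" for doc_id, _ in hits)
    return f"Answer using only the sources below and cite their ids.\n{context}\nQuestion: {question}"
```

Calling `build_prompt("When does leaf spot risk rise in humid weather?")` would place the `doc-leafspot` entry in the prompt. In a real system the quality of the answer would depend on how well the database is curated, which is the prerequisite named in the text.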

This also shows the limits of such a system. The much-promised functionality of predictions, such as market forecasts, can only be based on the learned market models, in particular because numerical data from an LLM cannot be used directly without further processing. The idea of using a general data-driven learning approach to gain new insights into data, or even to obtain the prediction model itself, is currently not available. The interpretation of prediction models can nevertheless be supported.




Figure 1: Fields of application of Large Language Models in agriculture that can currently be used or will be available soon. The biggest potential is seen in automated reporting, technical guidelines, the textual processing of prognosis system output, application decision support, application and consulting advice, textual processing of AI data interpretation results, as well as help and advice for the use of official applications.

What must be developed to increase the impact and trust of LLMs in agricultural practice?

One of the first steps should be the development of multilingual extensions for LLMs. While translations are rather simple, as word families are similar across different languages, dialects and technical terms, their transfer to the demands of agriculture is rather complex. On top of this comes the challenge of transferability. Recommendations and conclusions in one language may be conclusive and valid for one specific geographic region, but their translation, and thus transfer, to another region with different geographical, legal and agronomic conditions is not necessarily correct.

In order to increase trust in LLMs, the generated output could be monitored for insufficient information and the resulting inability of the system to produce a valid answer. In this context, the provenance of the LLM output must be shown. Commonly, it is totally unclear which data are used to generate the output. The reliability of the output is not necessarily linked to the reliability of the input, and vice versa. Nevertheless, this is an open point that needs discussion and evaluation, especially to avoid falling into total dependency on the leading “tech companies”.
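One possible reading of this monitoring idea is a small guard around the generated text: an answer is only passed on together with the sources that support it, and the system abstains when support is too weak. The sketch below assumes that some upstream component supplies a relevance score per source; the names `SourcedAnswer` and `guard_answer`, the score range and the threshold of 0.3 are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SourcedAnswer:
    """An answer together with its provenance."""
    text: str
    sources: list[str] = field(default_factory=list)
    sufficient: bool = True


def guard_answer(draft: str, evidence: dict[str, float], min_score: float = 0.3) -> SourcedAnswer:
    """Attach provenance to a draft answer, or abstain if support is too weak.

    `evidence` maps a source id to a relevance score in [0, 1]; how such scores
    are obtained is out of scope here. `min_score` is an arbitrary threshold.
    """
    supporting = sorted(s for s, score in evidence.items() if score >= min_score)
    if not supporting:
        return SourcedAnswer(
            text="Not enough validated information to answer this question.",
            sources=[],
            sufficient=False,
        )
    return SourcedAnswer(text=draft, sources=supporting)
```

Such a guard only shows which inputs were used. As noted above, it does not by itself make the output reliable.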

One application to make LLMs more usable in daily working life is to use LLMs as an open-ended decoder for vision-centric tasks (Wang et al., 2023). This enables natural language description of pictures, diagrams or charts. The reverse is also possible: text-to-image diffusion models (Zhang et al., 2023) provide a visual interpretation of written text and thus enable automatically generated illustrations for text paragraphs. The first step here is the publication of GPT-4V.

The LLM is not all you need (Gozalo-Brizuela et al., 2023), as further developments and progress in AI, especially generative approaches, will enable many new opportunities for automation. These systems could be provided to farmers, allowing them to automatically create agricultural maps, field overviews, documentation, or commands for agricultural machines and equipment simply by using context-relevant prompts. The reduced development and training of AI models through such AI-driven solutions is beneficial. It also ensures that farmers stay in charge of their own data. While in LLMs the last layer should include rules of non-discrimination, kindness, etc. (Hacker et al., 2023), the output of an agricultural chatbot needs to cover further rule sets (Fig. 2). Decision recommendations have to be based on the environment, since a decision can be right in one place but wrong in another, all with regard to plant biology and legislation. Sources of agricultural knowledge and good agricultural practice need to be included. Nevertheless, in the end all of this has to be linked to current weather data and farm metadata about sowing time, variety information and soil information.
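The additional rule sets can be pictured as a checking layer that compares a proposed recommendation with current weather data and farm metadata before it reaches the farmer. The following is a minimal sketch under assumptions: the `Context` fields, the two rules and especially the wind threshold are placeholders invented for illustration, not legal or agronomic values.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Context:
    """Weather and farm metadata available when a recommendation is checked."""
    region: str
    wind_speed_m_s: float
    rain_expected_24h: bool
    days_since_sowing: int


@dataclass
class Recommendation:
    action: str            # e.g. "spray"
    text: str


# A rule returns a reason string when it objects, otherwise None.
Rule = Callable[[Recommendation, Context], Optional[str]]


def wind_rule(rec: Recommendation, ctx: Context) -> Optional[str]:
    # 5.0 m/s is a placeholder threshold, not a legal or agronomic value.
    if rec.action == "spray" and ctx.wind_speed_m_s > 5.0:
        return "wind speed too high for spraying"
    return None


def rain_rule(rec: Recommendation, ctx: Context) -> Optional[str]:
    if rec.action == "spray" and ctx.rain_expected_24h:
        return "rain expected within 24 hours"
    return None


def check(rec: Recommendation, ctx: Context, rules: list[Rule]) -> list[str]:
    """Collect the objections of all rules; an empty list means the advice passes."""
    return [reason for rule in rules if (reason := rule(rec, ctx)) is not None]
```

Because the same recommendation is checked against a location-specific `Context`, it can pass in one place and be rejected in another. Regional legislation could be added as further rules of the same shape.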

Although current attention focuses on high-performing LLMs that are usable for a large number of applications, they do not yet work for agriculture. This is primarily due to the lack of a large amount of available agricultural data, the lack of up-to-date data, and the fact that the data are often protected and cannot be shared with such systems or outside specific companies. Approaches for trusted data integration without the risk of eroding property claims have been described but not yet implemented (Paulus & Leiding, 2023).

Besides economic aspects, technical ones will decide how fast and in what way these models will be developed. The current success of LLMs such as ChatGPT is due to the high investments made by large tech companies, for whom the monetization of such models and return on investment is the primary focus. However, national initiatives, such as OpenGPT-X, are also required to fund research for further progress towards accessible models for agricultural purposes.



Figure 2: Large Language Models are based on a trained language model. A prompt interpretation layer adds human usability, while a mentoring step implements rules for communication. While chatbots integrate rules for non-discriminatory and gentle communication, agricultural chatbots need to integrate agricultural knowledge based on rules, biology, environment and the good practice of farmers. Remaining challenges that need to be addressed in future work are: the question of multilinguality and thus regional applicability; trust in input data and data provenance, accompanied by the question of data ownership; the integration of text-to-image and image-to-text encoding; as well as factual verifiability.


Conclusions

The digital transformation and automation of agriculture has gained momentum in recent decades. The current progress in AI, and especially developments like LLMs, can help to accelerate automation in agriculture. The authors give an overview of where to expect possible application scenarios for the integration of LLMs into practical agricultural work tasks. In addition, the “moderator stage” in LLMs needs to be adapted with regard to environmental knowledge, legislation, climate and good agronomic practice.

Specialized LLMs for agriculture or crop protection will provide better consulting, explanation, interpretation and decision recommendations. LLMs will help to make field monitoring much more interpretable by translating images and sensor measurements into coherent and understandable human language. Nevertheless, a deep integration into agriculture can only succeed if it is ensured that the data behind these models are location-dependent, rely on real observations and are up to date. The latter aspect will be crucial to ensure that such systems always return reliable, trustworthy and correct output, especially if LLMs are used as an agricultural search engine or recommendation supplier.

Funding Acquisition


This study was partially funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy, EXC 2070-390732324.

Acknowledgements

The authors are thankful to Laura Zabawa for constructive and valuable remarks and William English for proofreading.


References

Arcila Beatriz Botero. (2023). Is it a Platform? Is it a Search Engine? It's Chat GPT! The European Liability Regime for Large Language Models. Journal of Free Speech Law 3:2.

Arawjo, I., Swoopes, C., Vaithilingam, P., Wattenberg, M., & Glassman, E. (2023). ChainForge: A Visual Toolkit for Prompt Engineering and LLM Hypothesis Testing (Version 1). arXiv.

Chang Yupeng, Wang Xu, Wang Jindong, Wu Yuan, Yang Linyi, Thu Kaihie, Chen Hao, Yi Xiaoyuan, Wang Cunixiang, Wang Yidong, Ye Wei, Zhang Yue, Chang Yi, Yu Philip S., Yang Qiang, Xie Xing. (2023). A Survey on Evaluation of Large Language Models. J. ACM. 37, 4:1-42.

Gozalo-Brizuela Roberto & C. Garrido-Merchán Eduardo (2023). ChatGPT is not all you need. A State of the Art Review of large Generative AI models. arXiv:2301.04655v1
Hacker Philipp, Engel Andreas, Mauer Marco (2023). Regulating ChatGPT and other Large Generative AI Models. FAccT '23: Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, p. 1112–1123. https://doi.org/10.1145/3593013.3594067

Jianyang Deng & Yijia Lin. (2022). The Benefits and Challenges of ChatGPT: An Overview. Frontiers in Computing and Intelligent Systems, 2:2.

Kuska MT, Heim René, Geedicke Ina, Gold Kaitlin M., Brugger A., Paulus Stefan. (2022). Digital plant pathology: a foundation and guide to modern agriculture. Journal of Plant Diseases and Protection 129:457-468.

Mahlein Anne-Katrin, Heim René Hans-Jürgen, Brugger Anna, Gold Kaitlin, Li Yang, Bashir Ali Kashif, Paulus Stefan, Kuska Matheus Thomas (2022). Special Issue: Digital Plant Pathology for Precision Agriculture. Journal of Plant Diseases and Protection 129:455–456.


Paulus, S., & Leiding, B. (2023). Can Distributed Ledgers Help to Overcome the Need of Labeled Data for Agricultural Machine Learning Tasks? In Plant Phenomics (Vol. 5). American Association for the Advancement of Science (AAAS).


Wang Wenhai, Chen Zhe, Chen Xiaokang, Wu Jiannan, Zhu Xizhou, Zeng Gang, Luo Ping, Lu Tong, Zhou Jie, Qiao Yu, Dai Jifeng (2023). VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks. arXiv:2305.11175

Yoon Bo Kyeong, Tae Hyunhyuk, Jackman Joshua A., Guha Supratik, Kagan Cherie R., Magenot Andrew J., Rowland Diane L., Weiss Paul S., Cho Nam-Joon. (2021). Entrepreneurial Talent Building for 21st Century Agricultural Innovation. ACS Nano 15, 7:10748-10758.

Zhang Chenshuang, Zhang Chaoning, Zhang Mengchun, Kweon In So (2023). Text-to-image Diffusion Models in Generative AI: A Survey. arXiv:2303.07909
Zolkin AL, Burda AG, Avdeev YM, Fakhertdinova DI (2021). The main areas of application of information and digital technologies in the agro-industrial complex. IOP Conf Ser Earth Environ Sci 677:032092.

