Author: lars

TikTok – what is the problem?

Last Friday, I was interviewed by Swedish television (the local Uppland channel) about the reasons for this and the possible dangers of the application from a security perspective. The interview can be found here, but it is only in Swedish, so I will describe the problem in this post as well.

The municipality of Uppsala, together with a large number of other public actors (in many other countries as well), has recently prohibited the use of TikTok on work phones. Apart from the fact that there may be very limited reasons to need access to TikTok on your phone during work at all, why would you prohibit the use of TikTok when you can still use YouTube, Instagram and Facebook? What is different about TikTok?

There are actually some reasons, both for the prohibition and for the difference between the applications. TikTok is an application that allows users to record short videos (max 3 min) and publish them on the TikTok platform. This has become very popular, above all among young people. There is also an ongoing critical discussion about the social aspects of the TikTok application, but that is not the topic of this post.

When the application is installed, it asks the user for permission to access photo and video storage, the camera and the microphone, which is of course quite reasonable, since the purpose of the app is precisely to record videos and store them on the user’s phone. However, it also asks for access to the contact list, and to the current location when in use. And here is one of the problems, namely that this data is handed by the user to an application we know very little about. But, one may object, this data is not dangerous; we give it to almost any social media application (actually, that might be something we should not do either, without some consideration).

The data the users provide is, however, not as innocent as it might seem at first sight. If the application can collect the data mentioned above, it can form a much bigger collection of “innocent” data, which is not so innocent anymore. It contains your contacts, the places where you have been, and also when you were there. If the data from different people are correlated across the whole data set, patterns may emerge that could reveal interesting things to people who are specifically interested. It could, for example, show regular visits to certain locations, or even that you meet certain people regularly. Still, who would be interested in this information? Not everything might be interesting, but suppose that you are engaged in a civil defense organization. Then the meeting places, the people you meet at those meetings, and who these people meet in other contexts might be very important information for a possible enemy. So, there are quite a few people in a city who could be of interest in this kind of analysis.
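To make the correlation argument concrete, here is a toy Python sketch (with invented names, places and times, not anything from a real app) that counts how often pairs of people appear at the same place at the same time. Even this naive counting surfaces a “regular meeting” from otherwise innocent-looking records:

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical location records: (person, place, hour-of-week).
# Purely illustrative data.
records = [
    ("anna", "community_hall", 18), ("bert", "community_hall", 18),
    ("anna", "community_hall", 42), ("bert", "community_hall", 42),
    ("anna", "cafe", 10), ("carl", "gym", 20),
]

# Group people by (place, time) to find who was co-present.
copresence = defaultdict(set)
for person, place, t in records:
    copresence[(place, t)].add(person)

# Count how often each pair of people co-occurs.
pair_counts = defaultdict(int)
for people in copresence.values():
    for pair in combinations(sorted(people), 2):
        pair_counts[pair] += 1

# Pairs seen together more than once suggest a regular meeting.
regular = {pair: n for pair, n in pair_counts.items() if n > 1}
print(regular)  # → {('anna', 'bert'): 2}
```

A real analysis would of course be far more sophisticated, but the point stands: the sensitive information lies not in any single record but in the correlations across the whole data set.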

But, as mentioned above, this information is provided to many different applications, so why has TikTok been singled out like this? Well, there is one additional argument, namely that it is very important that we know where the information is going in the larger perspective. This is where the history of TikTok becomes relevant. TikTok is owned by the Chinese company ByteDance, one of the biggest startup companies in the world, and this is where the main problem starts. The statement in the Privacy Policy gives an indication (my boldface added):

We may disclose any of the Information We Collect to respond to subpoenas, court orders, legal process, law enforcement requests, legal claims, or government inquiries, and to protect and defend the rights, interests, safety, and security of the Platform, our affiliates, users, or the public. We may also share any of the Information We Collect to enforce any terms applicable to the Platform, to exercise or defend any legal claims, and comply with any applicable law. 

TikTok Privacy Policy

The text in boldface provides a key to the problem. The data can be released in certain situations, which are not under our control. In 2017, China implemented a law that compels companies to turn over personal data relevant to China’s security. The question is then what this data might be. Depending on the situation, information pertaining to other countries’ military and/or civil defense might be very relevant to another country. The company can, through its parent company, be forced to hand over any information under the conditions mentioned.

What are the odds? It is difficult to say, of course. However, since TikTok is not crucial to the work in public organizations, there is no reason even to take the chance. Especially in the current situation, where there is such unrest in much of the world, there is definitely reason to be careful in general with handing out data.

But there are also some drawbacks to a general ban on the application. As mentioned above, the application is mostly used by young people. This also means that proper use of the application can become an entry point to different youth groups, which could be invaluable to certain groups in the municipality, such as social workers, schools, and not least libraries. The libraries have used the application for some time to spread information to young people under the hashtag #BookTok, which allegedly has been very popular. This will now become difficult to handle under the current ban. Of course there may be ways around the ban, but in my opinion it all goes to show that a ban on an application like this has to be carefully considered, and that there should be an awareness that there may be cases where exceptions have to be made. And, not least, there is a need for more information to potential users of social media about the possible risks that follow from the usage.

To quote a famous detective in a famous TV-series:

Let’s be careful out there….

Can we use AI-generated pictures for serious purposes?

The current extremely rapid development within Artificial Intelligence is the subject of widespread debate, and in most cases it is discussed in terms of potential dangers to humanity and increased possibilities for students to cheat on examinations. When it comes to Artificial Intelligence based art or image generators (AIAG), the questions are mostly focused on similarly negative issues, such as whether it really is art, or whether it is even going to put artists out of business. For the topic of this blog, however, I will reverse the direction of these discussions and take a more positive and, hopefully, more constructive perspective on Artificial Intelligence (*).

A small girl being very afraid of riding the elevator. Her anxiety may become a large problem for her unless treated at an early stage.

Real prompt: a small girl in an elevator, the girl looks very afraid and stands in one corner, the buttons of the elevator are in a vertical row, pencil drawing

The interesting thing is that we don’t focus the discussions more on the possibilities for these tools to be really useful, adding positively to the work. In this blog post I will therefore give an example of where the use of AIAGs as a tool can be very important within the area of health care, and more specifically within child psychiatry. The examples are collected from an information presentation for parents of children who suffer from panic disorder. The person who asked for the illustrations works as a counselor at a psychiatric unit for children and young people (BUP) in Sweden. Using the popular and very powerful AI art generation application MidJourney, I then produced the different illustrations for the presentation, some of which are reproduced in this post.

The main captions of the images in this post are taken from the requests made by the counselor, and do not show the actual prompts used, which are in many cases much less comprehensive (shown in smaller type below).

A boy hesitates at the far end of the landing-stage, showing some fear of the evil waves that are trying to catch him.

Real prompt: a boy dressed in swimming trunks::1 is standing at the end of a trampoline::2 , the boy looks anxious and bewildered::1, as if he fears jumping into the water::3, you can see his whole body::1 pencil drawing

It is often difficult to find visual material that is suitable as illustrations in this kind of situation, where there are high requirements on privacy and data safety. Clip art is often quite boring and may not engage the viewers. The high demands on privacy limit the use of stock photos, and copyright issues add further to the problems. Here we can see a very important application area for the Artificial Intelligence Art Generators, since these images are more or less guaranteed not to show any real human beings.

A small girl showing an all but hidden insecurity, being alone in the crowd on a town square.

Real prompt: a small girl is standing in the middle of the town square with lots of people walking by, the girl looks anxious and bewildered, as if she fears being alone, pencil drawing

The images displayed in this post are all produced according to the wishes of the counselor, which I then converted into prompts that produce the desired results. Not all attempts succeeded at once; some images had to have their prompts rewritten several times in order to reach the best result. This, of course, points to the simple fact that the role of the prompt writer will be very important in future illustration work.

Who does not recognize the classic scare of small children: “There is a wolf in my room!” It could of course also be a monster under the bed, or any other kind of scary imaginings that prevent the child from sleeping.

Real prompt: a small boy being very anxious when the parent leaves his room for him to sleep, he believes that there is a wolf under his bed, pencil drawing,

In the end, it is also important to point out that a good artist could of course have created all these pictures, and in even better versions. The power of the AIAGs is, in this example, that they can enable some people to make more and better illustrations as an integrated part of the production of presentations, information material, etc. The alternative is in many cases to just leave out the illustrations, since “I cannot draw anything at all, it just turns out ugly”.

Even when there are no monsters in the bedroom, just the parent leaving the child alone might be enough to provoke a very strong panic, which is difficult for the child to handle.

Real prompt: a small boy being very anxious when the parent leaves his room for him to sleep, pencil drawing

So, to conclude, this was just one example of when Artificial Intelligence systems can be very helpful and productive, if used properly. We just need to start thinking of all the possible uses we can find for the different systems, which, unfortunately, is less common than we would want, to some extent due to the large number of negative articles and discussions concerning the development of AI systems.

====================================

(*) In this post the term AI is used mostly in the classic sense of “weak AI”, namely the use of methods based on models that imitate processes within human thinking, which does not necessarily mean that the system is indeed “intelligent”. In this sense, I do not really consider the systems mentioned in this post to be intelligent, although they may well be advanced enough to emulate an intelligent being.

The real dangers of current “AI”…

Can we believe what we see or read in the future? This polar bear walking in the desert does not exist, but can still affect our thoughts. (Picture: Oestreicher and MidJourney).

Recently, a number of AI experts published an open letter advocating a pause in the development of new AI agents. The main reason for this is the very rapid development of chatbots based on generative networks, e.g., chatGPT and Bard, and a large number of competitors still in the starting blocks. These systems are now also publicly available at a fairly reasonable cost. The essence of the letter is that the current development is too fast for society (and humanity) to cope with. This is of course an important statement, although we already have social media, which, when used in the wrong way, has a serious impact on people in general (such as promoting absurd norms of beauty, or dangerous medical advice spreading in various groups).

The generative AI systems that are under discussion in the letter will undoubtedly have an impact on society, and we have definitely been taken by surprise in many realms already. Discussions are already under way on how to prevent students from cheating on their examinations by using chatGPT (see my earlier post about this here). The problem in that case is not the cheating, but that we teach in a way that makes it possible to cheat with these new tools. Prohibiting the use is definitely not the right way to go.

The same holds for the dangers pointed out by the signers of the public letter mentioned above. A simple voluntary pause in the development will not solve the problem at all. The systems are already here and being used. We will need to find other solutions to these dangers, and most important of all, we will need to study what these dangers really are. From my perspective, the dangers have nothing to do with the singularity, or with AI taking over the world, as some researchers claim. No, I can see at least two types of dangers: one immediate, and one that may appear within a few years or a decade.

Fact or fiction?

Did this chipmunk really exist? Well, in Narnia, he was a mouse, named Ripipip (Picture: Oestreicher and MidJourney).

The generative AI systems are based on an advanced (basically statistical) analysis of large amounts of data, either text (as in chatBots) or pictures (as in AI art generators such as DALL-E or MidJourney). The output from the systems has to be generated with this data as the primary (or only) source. This means that the output will not be anything essentially new, but, even more problematic, the models at the kernel of the systems are completely non-transparent. Even if it is possible to detect some patterns in the input and output sequences, it is quite safe to say that no human will understand the models themselves.
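To make the statistical idea concrete, here is a deliberately tiny Python sketch: a bigram model that only records which word followed which in a toy corpus, and then samples from those observed continuations. Real systems use huge neural networks trained on vastly more data, but the principle of generating output from patterns in the source material is the same:

```python
import random
from collections import defaultdict

# A toy corpus; real models train on billions of words.
corpus = "the cat sat on the mat and the cat ran".split()

# Record, for each word, which words were observed to follow it.
model = defaultdict(list)
for w1, w2 in zip(corpus, corpus[1:]):
    model[w1].append(w2)

random.seed(0)
word, out = "the", ["the"]
for _ in range(6):
    choices = model.get(word)
    if not choices:        # dead end: no observed continuation
        break
    word = random.choice(choices)  # sample from the observed followers
    out.append(word)
print(" ".join(out))
```

Note that nothing in the output can ever stray outside what was in the corpus; the model only recombines what it has seen, which is the point made above about the output not being anything essentially new.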

Furthermore, the actual text collections (or image bases, but I will leave those systems aside for a coming post) on which the systems are based are not available to the public, which causes the first problem. We, as users, don’t know what the source of a certain detail of the result is, whether it is a scientific text or a purely fictitious description in a sci-fi novel. Any text generated by the chatBot needs to be thoroughly scanned with a critical mind, in order not to accept things that are inaccurate (or even straightforwardly wrong). Even more problematic is that these errors may not be simple to detect. In the words of chatGPT itself:

GPT distinguishes between real and fictitious facts by relying on the patterns and context it has learned during its training. It uses the knowledge it has acquired from the training data to infer whether a statement is likely to be factual or fictional. However, the model’s ability to differentiate between real and fictitious facts is not perfect and depends on the quality and comprehensiveness of the training data.

chatGPT 3.5

And we know very little about the training data. The solution to this problem is most of the time phrased as “wait for the next generation”. The problem here is that the next generation of models will not be more transparent, rather the opposite.

So, how is the ordinary user, who is not an expert in a field, supposed to know whether the answers they get are correct or incorrect? For example, I had chatGPT produce two different texts: one giving the arguments that would prove God’s existence, and one giving the arguments that would prove that God does not exist. Both versions were very much to the point, but what should we make of that? Today, when many topics are the subject of heated debate, such as the climate crisis, the necessity of vaccinations, etc., this “objectivity” could be very dangerous if it is not met with a fair amount of critical thinking.

Recursion into absurdity – or old stuff in new containers?

Infinite recursion inspired by M.C. Escher. (Picture: Oestreicher and MidJourney).

As mentioned above, the models are based on large amounts of text, so far mostly produced by humans. However, today there is a large pool of productivity enhancers that provide AI support for the production of everything from summaries to complete articles or book chapters. It is quite reasonable to assume that more and more people will start using these services for their own private creations, as well as, hopefully with some caution as per the first problem above, in the professional sphere. We can assume that when there is a tool, people will start using it.

Now, as more and more generated texts appear on the public scene, they will undoubtedly mix with the human-created text masses. Since the material for the chatBots needs to be updated regularly in order to keep up with developments in the world, the generated texts will also slowly but steadily make their way into the training materials, and in the long run be recycled as new texts adding to the information content. The knowledge produced by the chatBots will be based more and more on generated texts, and my fear is that this will be a rapidly accelerating phenomenon that may greatly affect the forthcoming results. In the long run, we may not know whether a certain piece of knowledge was created by humans or by chatbots generating new knowledge from the things we already know.
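The feedback loop can be illustrated with a toy simulation in Python. This is only a crude analogy, not a model of real training pipelines: each “generation” of text is produced by resampling the previous generation with replacement, and distinct “ideas” gradually drop out of the pool:

```python
import random

random.seed(1)

# Toy feedback loop: each "generation" is produced by sampling from the
# previous one with replacement, a crude stand-in for retraining a
# model on machine-generated text.
vocab = list(range(100))   # 100 distinct "ideas" in the original human corpus
corpus = vocab * 10        # each idea occurs 10 times initially

for generation in range(20):
    # The next corpus consists entirely of "generated" material.
    corpus = random.choices(corpus, k=len(corpus))

remaining = len(set(corpus))  # how many distinct ideas survived
print(remaining)
```

With each round, sampling noise makes rare items rarer until they vanish, so `remaining` typically ends up well below the original 100. The real-world analogue would be a gradual narrowing of what the recycled models can produce.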

This recursive loop, traversing the human knowledge base mixed with the results from the generative AI systems, may not be as bad as one might think, but it might also lead to a large amount of absurdity being presented as factually correct knowledge. In the best case, we can be sure that most of the generated texts in the future will consist of old stuff repackaged in new containers.

Conclusions

What can be seen through the knowledge lens of the chatbots that are emerging (Picture: Oestreicher and MidJourney).

So, what are my conclusions from this? Should we freeze the development of these systems, as proposed in the open letter? We could, but I do not think that this would solve any problems. We have opened Pandora’s box, and the genie is already out of the bottle. In my perspective, the issue is more about learning how to use this knowledge so that it works for us in the best way. Prohibitions and legal barriers have never proved effective in stopping people from doing things. The solution is instead the promotion of knowledge, not least through the main sources of education, and by that I do not just mean the schools and universities, but journalists and writers in general, as well as the people who will be using these systems.

Already with social media, “fake news” and “fake science” have been a big problem, but as long as people regard information from external sources (such as social media, Google searches, Facebook or Reddit groups, and now chatBots) as plain truths and swallow it uncritically, we can pause the development of GPTs as much as we like and the problem will not go away. We started down this path already with the fast development of social media, and it will not disappear just because we cover our eyes.

So, I urge you as a reader to read this article with a critical mind, and don’t just believe everything that is written here. You know, I just might be completely wrong about this, myself.

AI and How Education Needs to Change

But will the new tools really make it possible to cheat that much? Well, if we maintain the old style of teaching and examining, the answer is undoubtedly “yes”. However, we can also see this as a possibility to improve, or even revolutionize, both education and examination. This, of course, needs some changes to be implemented. I will explain my thoughts a bit more in the following.

When we look at our teaching obligation, we need to pose the question: “What do we want our students to learn?” Well, knowledge about the topic at hand, of course. But is that really true? First of all, what do we define as knowledge? In many cases, the things that appear on the exams are questions about details, details that the students will be able to google as soon as they get outside the examination hall. Home exams are slightly better, since the students have to synthesize the answers, rather than just look them up. But now you can ask a program like chatGPT to do the synthesis for you. And is that cheating? In our old conception of examination, of course it is. What has the student done to get the piece of text written? Not very much!

Is the classical teaching doomed? No, but it needs to adapt to the new conditions. (Source: L. Oestreicher)

However, when we look closer at this, we can change the question a little and see what happens. The new question would be something along the lines of: “How could we change the way of teaching and examination so that this kind of helping tool will not be a cheating possibility (but maybe even a learning tool)?” My answer to this question is to focus on understanding. My favourite meme for teaching is: “You can lead a camel to the water, but you cannot force it to drink”. As teachers in higher education, we will have to focus teaching more on the “how it works” and “why it works” of the topics, rather than the “how can I implement it”. The students’ understanding of the (role of the) acquired knowledge in the applicable context has to be the most important teaching goal.

But don’t we do this already? Some people may, but we still see many exam questions that focus on the student memorizing the content of the course, rather than on understanding how to synthesize answers and on the skill of transferring that understanding to new domains.

In my own teaching I have changed the examination in my courses (one more theoretical, and two practical programming courses) from a written exam into an oral “discussion”. That may sound like a lot of work, but in fact it does not take more time than a written exam. After 30 minutes of this “academic conversation” style of examination, I usually have no problem grading the student according to understanding and reasoning, rather than the ability to remember a lot of details (which are most of the time forgotten fairly quickly after the course). This change was in fact introduced many years ago, way before the appearance of chatGPT and similar systems.

A further benefit is the new possibility of actually allowing the students to use any kind of supportive tool, including chatGPT, for their projects and learning experiences. The only condition they have to fulfill is that they themselves have to understand the answers they get from the various tools they use. In the programming courses this means, e.g., that they have to be able to explain any piece of code that they have not written all by themselves. They are also told that remaining errors that stem from the information source will affect their grades negatively. This of course applies to both text and code.

With this approach to both teaching and examination, we can turn the risk of “cheating” into an improved pedagogical view of courses and of the role of the teacher. Of course, it will still require the teacher to be well educated in the topic, in order to both teach and examine the students.

Lars Oestreicher