Tendencies in the application of artificial intelligence in the processing of photo materials
Analysis of the prospects for using artificial intelligence technologies in the field of photo processing. Research of modern technologies and AI methods that are used in processing photo materials. Review of areas of AI use in image processing.
Category | Programming, computers and cybernetics |
Type | article |
Language | English |
Date added | 08.06.2024 |
File size | 1.7 M |
Posted on http://www.allbest.ru/
Ruslana Buryk
Student, School of Communications, Media, Arts and Design, Centennial College, Toronto, Canada
Abstract
The research highlights the rapid development and growing adoption of artificial intelligence (AI) technologies in daily life, particularly through applications and gadgets such as voice assistants, weather forecasts and advanced photo processing tools. This integration of neural networks into everyday tasks demonstrates how artificial intelligence can significantly improve the quality of photo processing by correcting damage, color distortion and other defects. The main goal of this article is to explore the techniques that neural networks offer for restoring, editing and saving important images, and to consider how they can be improved and adapted to people's needs. The article analyzes the key steps of AI image processing, from image acquisition to image enhancement and restoration. Computer vision plays a critical role in this process, providing tools to optimize images for machine learning. Various methods of processing, compressing and expanding photographic data are considered, which help increase the efficiency of AI models in processing photographic materials. The problem of model overfitting (overtraining) and the methods for addressing it are covered separately, including model simplification, early stopping, data augmentation, regularization, and dropout. Such strategies improve the model's ability to generalize and therefore the quality of its results. The findings highlight that artificial intelligence and machine learning are opening up new horizons in image processing, providing tools for face recognition, object detection, and text recognition, as well as contributing to the development of deep learning. It is important to choose the appropriate tools and techniques to achieve optimal results, considering the potential of artificial intelligence and its impact on the future of photo processing.
Keywords: artificial intelligence, photo processing, photo editing, neural network, RGB space, image recognition.
Problem statement
The relevance of the study is explained by the rapid development of digital technologies and the constant growth of visual data, which requires effective analysis and processing. Artificial intelligence (AI) and, in particular, deep learning play a key role in solving these problems, as they allow the automation of many processes related to image recognition, classification, restoration, and optimization. The use of AI in image processing opens up new opportunities for various fields, including medicine, security, advertising, social media, and many others. It helps to improve the accuracy of diagnostic tools, develop AI systems with face recognition, optimize content for digital marketing, and improve the quality and accessibility of visual content for the end user. However, despite significant progress in this area, there are challenges that require further research. These include ethics and privacy issues, improving the efficiency of algorithms when processing big data, and developing more versatile AI models that can adapt to different conditions and tasks. Thus, the study of current trends in the use of artificial intelligence in photo processing is important for understanding the current state of the technology, identifying its capabilities and limitations, and identifying areas for further development and improvement.
Analysis of recent research and publications. Today, artificial intelligence solves tasks in various spheres of life. M. Ford [9] presents the theoretical foundations of artificial intelligence and describes the fields in which AI is used and its methods of problem solving.
A number of authors, including Ie. Gula, N. Zhuravlova and O. Mykhailytskyi [1], describe a broader range of computer vision uses, including calibration and self-orientation of information receivers, object detection, tracking, work with three-dimensional space, high-precision measurement of scene objects, description of the scene and identification of objects, and organization of visual feedback during photo processing.
G. Bhattacharjee [5] sees the purpose of using AI as drawing conclusions about the elements of an image by analyzing them [24]. A. Chatterjee and E. Cardilo [8] classify the tasks of using AI by their practical result: photo image processing, sorting of details, text processing, and others. In doing so, the researchers do not distinguish between special and general tasks of artificial intelligence.
In turn, K. Reid, D. L. Butler, C. Comfort and A. D. J. Potter [11] believe that an input image gives rise to the following tasks: comparison, in which the received image is compared with an existing database of objects and a conclusion is drawn about whether the image belongs to a certain class; search, in which a certain fragment is sought in the received image; restoration, in which an image is assembled from fragments; and classification, in which information classes are extracted from the image.
One cannot but agree with Yu. Doroshenko's statement [2] that working with a numerical array is not, in itself, the processing of photo materials. Rather, it is the computational operations on the digital image matrix that make it possible to enhance and filter the image. Thus, the image is pre-processed so that more complex algorithms can work with it afterwards.
Therefore, we see that research should be aimed at studying various AI techniques and algorithms used for image enhancement, restoration, editing and classification, with the aim of identifying key trends that will shape the future of this field.
Purpose of the article. The main purpose of the article is to analyze modern achievements in, and prospects for, the use of artificial intelligence (AI) technologies in the field of photo processing.
Objectives of the study:
- to analyze the modern technologies and methods of artificial intelligence that are used in the processing of photo materials, including deep learning;
- to determine the main trends and directions in the use of AI in image processing, focusing on innovative approaches and their potential for improving the quality and efficiency of processing photo materials;
- to explore the potential of artificial intelligence for solving traditional photo processing tasks, such as noise removal, resolution improvement and object recognition, and to assess the challenges associated with integrating AI into these processes.
Presenting main material
The use of AI has spread significantly in the practice of processing photo materials, becoming a part of everyday life. Most of us don't even realize how deeply neural networks are integrated into our daily tasks. Modern technologies make it possible to significantly improve the quality of images, compensating for their damage, color distortion or other defects. Our research aims to explore the various techniques that neural networks offer for the recovery, editing and storage of important images, and to focus on the possibilities of improving and adapting such processes to human needs [3].
Artificial intelligence aims to imitate the human mind in order to solve complex problems by creating models that mimic the workings of the brain. Neural networks, whose elements resemble the structure and functions of the human brain, are used to process information. They consist of nodes similar to neurons that interact with each other, forming a complex network. Each node can process input data, perform certain calculations and pass the results further along the network. Training the network consists of adjusting the weights of the connections between nodes as input information is processed, which allows the system to improve and perform tasks more efficiently.
In the structure of neural networks, there are several levels between the input and output elements, known as hidden layers, which are inaccessible to direct intervention by AI developers. Each layer in such a network is expressed mathematically through functions, while the relationships between elements are represented by weights, forming a matrix that describes the relationships between layers. This matrix structure makes it possible to process multiple inputs at the same time, generating appropriate results for each of them. Neural networks can contain from several to hundreds of layers with tens or even thousands of nodes. As the number of layers and the complexity of the structure increase, machine learning moves into the field of deep learning, which expands the model's ability to learn and predict, while also improving it over time (Fig. 1) [6].
Fig. 1. Fully connected feedforward neural network for processing photographic materials (legend: [number of nodes, number of links]; W1, W2 = weights of links reaching internal levels 1 and 2, respectively) [6, p. 45]
A basic neural network consists of an input layer, an output (or target) layer, and one or more hidden layers in between. These layers are interconnected by nodes, thus forming an interconnected system known as a neural network, to which the prefix “artificial” is added to emphasize its difference from the natural neural network of the human brain.
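The layered structure described above can be illustrated with a minimal sketch of a forward pass through one hidden layer. The layer sizes, the random weights and the function names (`dense`, `forward`, `random_matrix`) are assumptions made for illustration only; W1 and W2 play the role of the weight matrices in Fig. 1. Plain Python lists are used so that the sketch has no dependencies, whereas a numerical library would normally be used in practice.

```python
import math
import random

def relu(values):
    """ReLU activation: negative values are replaced by zero."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Convert raw output scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases):
    """One fully connected layer: weights[j][i] links input i to node j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def forward(pixels, w1, b1, w2, b2):
    """Input layer -> hidden layer (ReLU) -> output layer (softmax)."""
    hidden = relu(dense(pixels, w1, b1))
    return softmax(dense(hidden, w2, b2))

def random_matrix(rows, cols, rng):
    """Untrained weights; a real network would learn these values."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
n_inputs, n_hidden, n_classes = 16, 8, 2  # hypothetical sizes (4 x 4 grayscale image)
w1, b1 = random_matrix(n_hidden, n_inputs, rng), [0.0] * n_hidden
w2, b2 = random_matrix(n_classes, n_hidden, rng), [0.0] * n_classes

pixels = [rng.random() for _ in range(n_inputs)]  # stand-in for normalized pixel values
probabilities = forward(pixels, w1, b1, w2, b2)
print(probabilities)
```

Because the weights here are random, the two output probabilities carry no meaning; training would adjust `w1` and `w2` so that the output matches known labels.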
In the context of image processing with the help of AI, the process of machine learning includes several stages of data processing before its use. This requires providing a large amount of high-quality data for accurate training and prediction. To optimize images for machine learning, computer vision (CV) plays a key role, allowing machines to interpret visual information from images. The application of CV helps in the processing, digitization, transformation and adaptation of images to create an ideal dataset [12].
As an example, let us consider creating an algorithm to detect the presence of a dog or cat in an image. This requires collecting and pre-processing dog and cat photos using computer vision, which includes such steps as converting the images to a common format, cropping unimportant areas, and converting the images to a numerical format suitable for processing by machine learning algorithms. The computer perceives the input image as an array of pixels, whose number is determined by the image resolution [4]. Thus, it considers the image in the format “height x width x color depth”. For example, an RGB image of 6 x 6 pixels is a three-dimensional array of “6 x 6 x 3” (where 3 is the number of RGB color channels), while a grayscale image of 4 x 4 pixels is represented as an array of “4 x 4 x 1” (Fig. 2).
Fig. 2. Examples of recognizing patterns and objects in photos using AI (created by the author in Photoshop Neural Filters)
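The “height x width x color depth” representation can be made concrete with a short sketch. The helper names (`make_image`, `shape`, `flatten`) are hypothetical, and nested Python lists stand in for the arrays that an image library would normally provide.

```python
def make_image(height, width, channels, fill=0):
    """Build a height x width x channels image as nested lists."""
    return [[[fill] * channels for _ in range(width)] for _ in range(height)]

def shape(image):
    """Return (height, width, channels) of a nested-list image."""
    return (len(image), len(image[0]), len(image[0][0]))

def flatten(image):
    """Turn the 3-D array into a flat feature vector scaled to the 0-1 range."""
    return [value / 255 for row in image for pixel in row for value in pixel]

rgb = make_image(6, 6, 3)    # 6 x 6 RGB image
gray = make_image(4, 4, 1)   # 4 x 4 grayscale image
rgb[0][0] = [255, 0, 0]      # top-left pixel set to pure red

print(shape(rgb), len(flatten(rgb)))    # (6, 6, 3) 108
print(shape(gray), len(flatten(gray)))  # (4, 4, 1) 16
```

The flat vector of 108 numbers is the form in which such an RGB image would typically be handed to a fully connected network.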
Next, these features (processed data) are used to select and develop a machine learning algorithm that classifies unrecognized feature vectors based on an extensive database of known feature vectors. Choosing the optimal algorithm is key; some of the most commonly used techniques include Bayesian networks, decision trees, genetic algorithms, nearest neighbors, and neural networks. In the process of training a model, there may come a point when it stops generalizing from the data and begins to capture random noise, which leads to an increase in errors when predicting on new data. The main challenge is to train the model well enough to capture the dependencies between the input and output data, while avoiding overfitting (overtraining) it on the training set.
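The classification of an unknown feature vector against a database of known ones, mentioned at the start of the preceding paragraph, can be sketched with the nearest neighbors technique. The two-dimensional feature vectors, the labels and the function name `knn_classify` are invented for illustration; real image features would have far more dimensions.

```python
import math
from collections import Counter

def knn_classify(query, known, k=3):
    """Label a feature vector by majority vote among its k nearest known vectors.

    known is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(known, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical database of already labelled feature vectors.
known = [
    ([0.90, 0.10], "cat"), ([0.80, 0.20], "cat"), ([0.85, 0.30], "cat"),
    ([0.10, 0.90], "dog"), ([0.20, 0.80], "dog"), ([0.30, 0.70], "dog"),
]

print(knn_classify([0.75, 0.25], known))  # cat
print(knn_classify([0.25, 0.75], known))  # dog
```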
There are several methods that help avoid overfitting (overtraining) in photo processing models [13, 16]:
1) Simplifying the model involves reducing its complexity by eliminating some layers or reducing the number of neurons in the network, making it more compact. When making such changes, it is critical to check how this affects the sizes of input and output data at various stages of computation in the network;
2) Early stopping means halting training as soon as performance on the validation data set begins to degrade, which strikes a balance between undertraining and overtraining;
3) Data augmentation involves expanding the volume of training data by modifying existing images, for example by flipping, rotating, scaling, or adding noise, which helps improve the model's ability to generalize;
4) Regularization involves adding penalties to the loss function to limit the magnitude of the weights. L1 regularization penalizes the absolute values of the weights, while L2 regularization penalizes their squares; the choice between them depends on the complexity of the photo materials;
5) The dropout method consists in randomly ignoring some neurons during each training iteration, which simulates the training of several different networks and, as a result, reduces the risk of overfitting, since the model becomes less dependent on specific features of the input photo materials.
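Several of the methods listed above can be sketched in a few lines each. The code below is an illustrative sketch, not a training pipeline: all function names, the `patience` parameter and the sample numbers are assumptions, grayscale images are represented as nested lists, and the dropout variant shown is the common "inverted" form that rescales the surviving activations.

```python
import random

def flip_horizontal(image):
    """Augmentation: mirror a grayscale image (list of pixel rows) left to right."""
    return [row[::-1] for row in image]

def rotate_90(image):
    """Augmentation: rotate the image 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]

def add_noise(image, rng, amount=10):
    """Augmentation: shift each pixel by a random offset, clamped to 0-255."""
    return [[min(255, max(0, p + rng.randint(-amount, amount))) for p in row]
            for row in image]

def l1_penalty(weights, lam):
    """L1 term added to the loss: lam times the sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)

def l2_penalty(weights, lam):
    """L2 term added to the loss: lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)

def dropout(activations, rate, rng):
    """Inverted dropout: zero each activation with probability `rate`
    and rescale the survivors so the expected sum is unchanged."""
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

def early_stopping(val_losses, patience=2):
    """Return (stop_epoch, best_epoch): training stops once the validation
    loss has failed to improve for `patience` consecutive epochs."""
    best_loss, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

rng = random.Random(42)
image = [[10, 20], [30, 40]]
augmented = [flip_horizontal(image), rotate_90(image), add_noise(image, rng)]
print(augmented[:2])  # [[[20, 10], [40, 30]], [[30, 10], [40, 20]]]
print(early_stopping([0.9, 0.7, 0.6, 0.65, 0.7, 0.8]))  # (4, 2)
```

In the early stopping example the validation loss is lowest at epoch 2, so the weights saved at that epoch would be the ones to keep.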
At its core, image processing means making changes to an image in order to improve its quality or to extract information. AI-based digital processing of photographic materials uses computer algorithms to edit digital images. In the context of digital processing, the final product can be both the image itself and the data associated with it, such as identified objects, contours, features or masks. Today, photo processing is actively used in areas such as biometrics, video games, medical imaging, video surveillance, autonomous driving, law enforcement, and others. Here are some of the main purposes of processing photographic materials [16]:
- visualization: presents processed photos in a clearer and more informative way, making invisible objects visually apparent;
- image enhancement and restoration: increases clarity and improves image quality;
- image search: makes it easier to identify and find specific images;
- object measurement: makes it possible to measure objects in an image;
- object recognition: identifies and classifies objects in a photo, determines their location, and analyzes the scene (Fig. 3).
Fig. 3. An example of image recognition with a watermark [14]
Now let us take a closer look at each of the stages.
Image acquisition - this process involves “capturing” an image using a sensor, such as a camera, and converting it into a format suitable for further use (e.g., a digital image file). One well-known method of image acquisition is scraping.
Image enhancement - in this phase, the quality of the captured image is enhanced to extract hidden information from it for further processing.
Image restoration - this phase also aims to improve the image quality by eliminating possible defects to obtain a clearer image. The process is based on the use of probabilistic and mathematical models to remove noise, blurring, missing pixels, watermarks, camera focus errors, and other problems that can negatively affect the neural network training process (Fig. 4).
Fig. 4. An example of “capturing” a photo image using a neural network sensor (created by the author in Photoshop Neural Filters)
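As a simple illustration of noise removal during restoration, the sketch below applies a classical median filter rather than a neural network: each pixel is replaced by the median of its neighbourhood, which suppresses isolated bright or dark outliers. The function name, the window radius and the sample image are assumptions chosen for the example.

```python
def median_filter(image, radius=1):
    """Replace each pixel of a grayscale image (nested lists) by the median
    of the surrounding window; the window is clipped at the image borders."""
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            window = sorted(
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            )
            row.append(window[len(window) // 2])  # upper median for even-sized windows
        result.append(row)
    return result

noisy = [[100] * 5 for _ in range(5)]
noisy[2][2] = 255   # "salt" pixel
noisy[0][4] = 0     # "pepper" pixel
restored = median_filter(noisy)
print(restored[2][2], restored[0][4])  # 100 100
```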
Color image processing involves working with color images and different color spaces. Depending on the type of image, it is possible to use techniques of pseudo-color processing or processing in RGB space.
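Both directions of color processing mentioned above can be sketched briefly: converting an RGB pixel to a gray level, and pseudo-color processing that maps a gray level back onto a color. The luma weights are the widely used ITU-R BT.601 coefficients; the blue-to-red palette and the function names are hypothetical choices for illustration.

```python
def rgb_to_gray(pixel):
    """Weighted sum of the R, G and B channels (ITU-R BT.601 luma weights)."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)

def pseudo_color(gray):
    """Map a gray level (0-255) onto a simple blue-to-red ramp:
    dark values become blue, bright values become red."""
    return (gray, 0, 255 - gray)

print(rgb_to_gray((255, 0, 0)))  # 76
print(pseudo_color(0))           # (0, 0, 255)
print(pseudo_color(255))         # (255, 0, 0)
```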
Image compression and decompression - this process makes it possible to change the size and resolution of the image. Compression reduces the size of an image file by lowering its resolution and dimensions, while decompression restores the image to its original size and resolution (Fig. 5 and Fig. 6).
These techniques are often used to increase the amount of data in a dataset by augmenting it with slightly modified images. This improves the ability of the neural network model to generalize and to provide high-quality results.
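A minimal sketch of changing image size in both directions is given below, assuming block averaging for reduction and pixel repetition for enlargement; both function names and the sample image are illustrative. In this simple scheme the original dimensions are restored, but detail averaged away during reduction is not recovered, which is one reason learned upscaling models are of interest.

```python
def downsample(image, factor=2):
    """Reduce resolution by averaging each factor x factor block of pixels."""
    height, width = len(image), len(image[0])
    return [[sum(image[y + dy][x + dx]
                 for dy in range(factor) for dx in range(factor)) // factor ** 2
             for x in range(0, width - factor + 1, factor)]
            for y in range(0, height - factor + 1, factor)]

def upsample(image, factor=2):
    """Enlarge the image by repeating every pixel factor times in each direction."""
    return [[p for p in row for _ in range(factor)]
            for row in image for _ in range(factor)]

image = [
    [10, 20, 30, 40],
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [50, 60, 70, 80],
]
small = downsample(image)
print(small)            # [[15, 35], [55, 75]]
print(upsample(small))  # 4 x 4 again, but each 2 x 2 block now holds one value
```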
Morphological processing - this method analyzes the shape and structure of objects in an image, which can be useful in preparing data for training artificial intelligence models. In particular, morphological analysis and processing are used when it is necessary to describe the attributes that the AI model should identify or recognize [2].
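Two basic morphological operations, erosion and dilation, can be sketched on a binary image as follows. The 3 x 3 neighbourhood, the clipping of the window at the image border and the function names are assumptions of this sketch. Applying erosion followed by dilation (an "opening") removes small specks while preserving the shape of larger objects.

```python
def neighbourhood(image, y, x):
    """Values in the 3 x 3 window around (y, x), clipped at the image border."""
    height, width = len(image), len(image[0])
    return [image[j][i]
            for j in range(max(0, y - 1), min(height, y + 2))
            for i in range(max(0, x - 1), min(width, x + 2))]

def erode(image):
    """A pixel stays 1 only if every pixel in its neighbourhood is 1."""
    return [[min(neighbourhood(image, y, x)) for x in range(len(image[0]))]
            for y in range(len(image))]

def dilate(image):
    """A pixel becomes 1 if any pixel in its neighbourhood is 1."""
    return [[max(neighbourhood(image, y, x)) for x in range(len(image[0]))]
            for y in range(len(image))]

def opening(image):
    """Erosion followed by dilation: removes objects smaller than the window."""
    return dilate(erode(image))

# A 3 x 3 square object in a 7 x 7 binary image, plus one isolated speck.
clean = [[1 if 2 <= y <= 4 and 2 <= x <= 4 else 0 for x in range(7)]
         for y in range(7)]
noisy = [row[:] for row in clean]
noisy[0][0] = 1

print(opening(noisy) == clean)  # True
```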
Fig. 5. Initial photo [5]
Fig. 6. Augmented image elements using AI [5]
Image recognition is the process of identifying specific characteristics and objects in an image. AI image recognition includes techniques such as object detection, object recognition, and segmentation, which are areas where AI-based solutions are highly effective. After completing all the stages of processing photo materials, one can start creating, training and testing real AI models [10].
The process of developing a deep learning solution covers the entire range of actions from data collection to integration of the developed AI model into the final system. Visualization and interpretation are the processes that make it possible to display and describe the processed images. The output of AI-based systems is a set of numbers and values that represent the information the model was trained to generate. Typically, deep neural networks alone do not produce output in a human-readable format.
Fig. 7. Morphological processing of photo materials using AI (performed by the author in the Topaz Labs software environment)
However, with specialized visualization tools, these numerical arrays can be transformed into images suitable for detailed analysis. The use of AI and machine learning technologies significantly increases the speed of data processing and the quality of the obtained results. Thanks to AI platforms, it is possible to perform complex tasks such as face recognition, object detection and text recognition. But in order to achieve high-quality results, it is necessary to choose appropriate tools and methods [7].
Deep neural networks containing several layers of hidden neurons face the vanishing gradient problem, which complicates error backpropagation. This problem can be mitigated by ReLU activation functions, specific weight initialization schemes, and architectures with skip (pass-through) connections. The term “depth” in the context of deep learning refers to the number of layers in a neural network. Therefore, the concept of deep learning is often used as a synonym for deep neural networks. A neural network with three or more layers, including the input and output layers, is usually classified as a deep neural network; otherwise it is regarded as a basic neural network. Neural networks are a very powerful and widely used tool for input data analysis [16].
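The effect of the activation function on the backpropagated gradient can be illustrated with a deliberately simplified sketch. It assumes a chain of layers in which every weight equals 1 and every pre-activation value is the same, so the gradient reaching the first layer is just the product of the activation derivatives; the function names are illustrative.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of the sigmoid; its maximum value is 0.25, reached at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

def gradient_through_layers(grad_fn, pre_activation, n_layers):
    """Product of activation derivatives along a chain of n_layers
    (simplifying assumption: all weights are 1)."""
    gradient = 1.0
    for _ in range(n_layers):
        gradient *= grad_fn(pre_activation)
    return gradient

print(gradient_through_layers(sigmoid_grad, 0.0, 20))  # about 9.1e-13
print(gradient_through_layers(relu_grad, 1.0, 20))     # 1.0
```

Even in the sigmoid's best case the gradient shrinks by a factor of four per layer, whereas ReLU passes it through unchanged for positive inputs.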
Camera technologies that use machine learning have the potential to turn our phones into modern digital SLR cameras. Face recognition technologies have been used in digital SLR and mirrorless cameras for years, but the use of artificial intelligence for real-time autofocus makes the process much easier. Thanks to AI, photographers no longer need to track an object by its features, such as a bird's eye: they just point the camera and the system focuses automatically [8]. Face recognition technology can analyze the scene in real time, identify the presence of people and automatically focus on faces. AI-based software tools such as Narrative Select allow you to quickly sort and remove unwanted shots from thousands of images. Photo editing has become extremely simple and productive thanks to AI-supported tools [11] such as Adobe Photoshop, Skylum and Topaz Labs. In particular, Neural Filters in Photoshop open new possibilities for AI-powered image creativity, making it possible to manipulate photos in seconds, with an ease that was unimaginable just a few years ago.
Since neural networks have a large number of parameters, deep networks are prone to overfitting the training data, so they need to be regularized. Regularization methods, in addition to cross-validation, include weight decay, dropout, truncation, and dataset expansion.
Artificial intelligence has enormous potential in the field of photo processing, opening new horizons for the development of technologies, improving the quality of images and increasing the efficiency of processes. Further innovations in AI are expected to continue to impact photography, providing ever more avant-garde and user-friendly solutions.
Conclusions
The use of artificial intelligence in the processing of photo materials reveals significant progress and has already become an integral part of our daily life, which emphasizes the deep integration of neural networks into various daily tasks. This study examines in detail the wide range of techniques proposed by neural networks for restoring, editing, and preserving photographs, emphasizing the potential of AI to adapt and improve these processes according to human needs. It has been determined that AI has found its application in many aspects of photo processing, from restoring damaged images to compensating for color distortions and other defects. Modern AI technologies are able to significantly improve the quality of photos, providing opportunities to restore details, improve clarity, and correct colors, which was previously unattainable without complex manual interventions.
It has been shown that AI is aimed at imitating the work of the human brain through the creation of complex models and algorithms, which makes it possible to solve information processing tasks at a new level of efficiency. The complex architecture of neural networks, including input, output, and hidden layers, plays a key role in the learning and adaptation processes, allowing the system to improve over time. As the number of layers and the complexity of the structure increase, machine learning moves into the area of deep learning, opening up new opportunities for modeling and prediction.
Key challenges have been identified, including the problem of overfitting (overtraining), and strategies to overcome them have been discussed, including model simplification, early stopping, data augmentation, regularization, and the use of dropout. The research highlights the potential for AI to radically transform the field of photography, particularly through improvements in real-time autofocus and the expansion of image editing capabilities.
References
1. Gula, Ie., Zhuravlova, N., Mykhailytskyi, O. (2023). Syntez rozvytku knyzhkovoyi hrafiky ta dyzaynu v Ukrayini [Synthesis of Development of Book Graphics and Design in Ukraine]. Kultura і suchasnist: almanakh, 1, 70-75 [in Ukrainian].
2. Doroshenko, Yu. (2023). Mozhlyvosti deformatyvnoho formotvorennya u hrafichnomu dyzayni [Possibilities of deformative formation in graphic design]. Applied Geometry, Engineering Graphics and Intellectual Property Objects, 1(XII), 156-159. [in Ukrainian].
3. Makedon, V. and Ilchenko, N. (2021). Kon"yunktura svitovoho rynku IT-posluh v umovakh ekonomiky 4.0 [World market of IT services in the conditions of economy 4.0]. Efektyvna ekonomika, [Online], vol. 1, available at: http://www.economy.nayka.com.ua/?op=1&z=8525 (Accessed 29 January 2024). DOI: 10.32702/2307-2105-2021.1.8. [in Ukrainian].
4. Makedon, V. V., Kholod, O. H., Yarmolenko, L. I. (2023). Model' otsinky konkurentospromozhnosti vysokotekhnolohichnykh pidpryyemstv na zasadakh formuvannya klyuchovykh kompetentsiy [The model of assessing the competitiveness of high-tech enterprises based on the formation of key competencies]. Akademichnyy ohlyad, 2(59), 75-89. DOI: 10.32342/2074-53542023-2-59-5. [in Ukrainian].
5. Bhattacharjee, G. (2023). Art and Photography in the Age of Artificial Intelligence. 12th International Photographic Conference of PAD, Kolkata. Retrieved from: https://www.academia.edu/97272074/Art_and_Photography_in_the_Age_of_Artificial_Intelligence (Accessed 29 January 2024). [in English].
6. Brynjolfsson, E., McAfee, A. (2012). Race Against the Machine: How the Digital Revolution is Accelerating Innovation, Driving Productivity, and Irreversibly Transforming Employment and the Economy. Lexington, Massachusetts: Digital Frontier Press. [in English].
7. Chamberlain, R., Mullin, C., Scheerlinck, B., and Wagemans, J. (2018). Putting the art in artificial: Aesthetic responses to computer-generated art. Psychology of Aesthetics, Creativity, and the Arts, 12, 177-192. doi: 10.1037/aca0000136. [in English].
8. Chatterjee, A., and Cardilo, E. (2021). Brain, beauty, and art: Essays bringing neuroaesthetics into focus. Oxford: Oxford University Press. doi: 10.1093/oso/9780197513620.001.0001. [in English].
9. Ford, M. (2021). Rule of the robots: How artificial intelligence will transform everything. Hachette: Basic Books. [in English].
10. Jin, K. H., McCann, M. T., Froustey, E., Unser, M. (2017). Deep convolutional neural network for inverse problems in imaging. IEEE Transactions on Image Processing. 26: 4509-4522. [in English].
11. Reid, K., Butler, D. L., Comfort, C., & Potter, A. D. J. (2023). Virtual internships in open and distance learning contexts: Improving access, participation, and success for underrepresented students. Distance Education, 44(2), 267-283. DOI: 10.1080/01587919.2023.2209029. [in English].
12. Kim, T. (2022). The future of creativity, brought to you by artificial intelligence. Retrieved from: https://venturebeat.com/datadecisionmakers/the-future-of-creativity-brought-to-you-by-artificial-intelligence/ (Accessed 31 January 2024). [in English].
13. McAfee, A., & Brynjolfsson, E. (2017) Machine, Platform, Crowd: Harnessing Our Digital Future. New York: W.W. Norton & Company. [in English].
14. Ozimek, P., Lainas, S., Bierhoff, HW. et al. (2023). How photo editing in social media shapes self-perceived attractiveness and self-esteem via self-objectification and physical appearance comparisons. BMC Psychol., 11, 99. https://doi.org/10.1186/s40359-023-01143-0. [in English].
15. Taylor, J. (2022). No quick fix: How OpenAI's DALL-E 2 illustrated the challenges of bias in AI. Available online at: https://www.nbcnews.com/tech/tech-news/no-quick-fix-openais-dalle-2-illustrated-challenges-bias-ai-rcna39918. [in English].
16. Zhang, L., Zhang, Q., Wu, M., Yu, J. and Xu, L. (2021). Neural Video Portrait Relighting in Real-time via Consistency Modeling. 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 782-792, doi: 10.1109/ICCV48922.2021.00084. [in English].