AI for Business

Explore the best AI for Business — independent reviews, comparisons, pricing and step-by-step how-to guides, curated by Aizhi.

Microscope image processing

Microscope image processing is a broad term that covers the use of digital image processing techniques to process, analyze and present images obtained from a microscope. Such processing is now commonplace in a number of diverse fields such as medicine, biological research, cancer research, drug testing, metallurgy, etc. A number of manufacturers of microscopes now specifically design in features that allow the microscopes to interface to an image processing system. == Image acquisition == Until the early 1990s, most image acquisition in video microscopy applications was typically done with an analog video camera, often simply closed circuit TV cameras. While this required the use of a frame grabber to digitize the images, video cameras provided images at full video frame rate (25-30 frames per second) allowing live video recording and processing. While the advent of solid state detectors yielded several advantages, the real-time video camera was actually superior in many respects. Today, acquisition is usually done using a CCD camera mounted in the optical path of the microscope. The camera may be full colour or monochrome. Very often, very high resolution cameras are employed to gain as much direct information as possible. Cryogenic cooling is also common, to minimise noise. Often digital cameras used for this application provide pixel intensity data to a resolution of 12-16 bits, much higher than is used in consumer imaging products. Ironically, in recent years, much effort has been put into acquiring data at video rates, or higher (25-30 frames per second or higher). What was once easy with off-the-shelf video cameras now requires special, high speed electronics to handle the vast digital data bandwidth. Higher speed acquisition allows dynamic processes to be observed in real time, or stored for later playback and analysis. Combined with the high image resolution, this approach can generate vast quantities of raw data, which can be a challenge to deal with, even with a modern computer system. While current CCD detectors allow very high image resolution, often this involves a trade-off because, for a given chip size, as the pixel count increases, the pixel size decreases. As the pixels get smaller, their well depth decreases, reducing the number of electrons that can be stored. In turn, this results in a poorer signal-to-noise ratio. For best results, one must select an appropriate sensor for a given application. Because microscope images have an intrinsic limiting resolution, it often makes little sense to use a noisy, high resolution detector for image acquisition. A more modest detector, with larger pixels, can often produce much higher quality images because of reduced noise. This is especially important in low-light applications such as fluorescence microscopy. Moreover, one must also consider the temporal resolution requirements of the application. A lower resolution detector will often have a significantly higher acquisition rate, permitting the observation of faster events. Conversely, if the observed object is motionless, one may wish to acquire images at the highest possible spatial resolution without regard to the time required to acquire a single image. == 2D image techniques == Image processing for microscopy application begins with fundamental techniques intended to most accurately reproduce the information contained in the microscopic sample. This might include adjusting the brightness and contrast of the image, averaging images to reduce image noise and correcting for illumination non-uniformities. Such processing involves only basic arithmetic operations between images (i.e. addition, subtraction, multiplication and division). The vast majority of processing done on microscope image is of this nature. Another class of common 2D operations called image convolution are often used to reduce or enhance image details. Such "blurring" and "sharpening" algorithms in most programs work by altering a pixel's value based on a weighted sum of that and the surrounding pixels (a more detailed description of kernel based convolution deserves an entry for itself) or by altering the frequency domain function of the image using Fourier Transform. Most image processing techniques are performed in the Frequency domain. Other basic two dimensional techniques include operations such as image rotation, warping, color balancing etc. At times, advanced techniques are employed with the goal of "undoing" the distortion of the optical path of the microscope, thus eliminating distortions and blurring caused by the instrumentation. This process is called deconvolution, and a variety of algorithms have been developed, some of great mathematical complexity. The end result is an image far sharper and clearer than could be obtained in the optical domain alone. This is typically a 3-dimensional operation, that analyzes a volumetric image (i.e. images taken at a variety of focal planes through the sample) and uses this data to reconstruct a more accurate 3-dimensional image. == 3D image techniques == Another common requirement is to take a series of images at a fixed position, but at different focal depths. Since most microscopic samples are essentially transparent, and the depth of field of the focused sample is exceptionally narrow, it is possible to capture images "through" a three-dimensional object using 2D equipment like confocal microscopes. Software is then able to reconstruct a 3D model of the original sample which may be manipulated appropriately. The processing turns a 2D instrument into a 3D instrument, which would not otherwise exist. In recent times this technique has led to a number of scientific discoveries in cell biology. == Analysis == Analysis of images will vary considerably according to application. Typical analysis includes determining where the edges of an object are, counting similar objects, calculating the area, perimeter length and other useful measurements of each object. A common approach is to create an image mask which only includes pixels that match certain criteria, then perform simpler scanning operations on the resulting mask. It is also possible to label objects and track their motion over a series of frames in a video sequence.
Read more →
How to Choose an AI Video Generator

Looking for the best AI video generator? An AI video generator is software that uses machine learning to help you get more done — it can save you hours every week by automating repetitive work. Most options offer a generous free tier, with paid plans unlocking higher limits, faster processing, and team features. Whether you are a beginner or a pro, the right AI video generator slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.
Read more →
Is an AI Text-to-video Tool Worth It in 2026?

In search of the best AI text-to-video tool? An AI text-to-video tool is software that uses machine learning to help you get more done — it turns a rough idea into a polished result in seconds. When choosing one, weigh output quality, pricing, export formats, and how well it fits the tools you already use. Whether you are a beginner or a pro, the right AI text-to-video tool slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.
Read more →
Babel Fish (website)

Yahoo! Babel Fish was a free Web-based machine translation service by Yahoo!. In May 2012 it was replaced by Bing Translator (now Microsoft Translator), to which queries were redirected. Although Yahoo! has transitioned its Babel Fish translation services to Bing Translator, it did not sell its translation application to Microsoft outright. As the oldest free online language translator, the service translated text or Web pages in 36 pairs between 13 languages, including English, Simplified Chinese, Traditional Chinese, Dutch, French, German, Greek, Italian, Japanese, Korean, Portuguese, Russian, and Spanish. The internet service derived its name from the Babel fish, a fictional species in Douglas Adams's book and radio series The Hitchhiker's Guide to the Galaxy that could instantly translate languages. In turn, the name of the fictional creature refers to the biblical account of the confusion of languages that arose in the city of Babel. == History == On December 9, 1997, Digital Equipment Corporation (DEC) and SYSTRAN S.A. launched AltaVista Translation Service at babelfish.altavista.com, which was developed by a team of researchers at DEC. In February 2003, AltaVista was bought by Overture Services, Inc. In July 2003, Overture, in turn, was taken over by Yahoo!. The web address for Babel Fish remained at babelfish.altavista.com until May 9, 2008, when the address changed to babelfish.yahoo.com. In 2012, the Web address changed again, this time redirecting babelfish.yahoo.com to www.microsofttranslator.com when Microsoft's Bing Translator replaced Yahoo Babel Fish. As of June 2013, babelfish.yahoo.com no longer redirects to the Microsoft Bing Translator. Instead, it refers directly back to the main Yahoo.com page.
Read more →
Fairness (machine learning)

Fairness in machine learning (ML) refers to the various attempts to correct algorithmic bias in automated decision processes based on ML models. Decisions made by such models after a learning process may be considered unfair if they were based on variables considered sensitive (e.g., gender, ethnicity, sexual orientation, or disability). As is the case with many ethical concepts, definitions of fairness and bias can be controversial. In general, fairness and bias are considered relevant when the decision process impacts people's lives. Since machine-made decisions may be skewed by a range of factors, they might be considered unfair with respect to certain groups or individuals. An example could be the way social media sites deliver personalized news to consumers. == Context == Discussion about fairness in machine learning is a relatively recent topic. Since 2016 there has been a sharp increase in research into the topic. This increase could be partly attributed to an influential report by ProPublica that claimed that the COMPAS software, widely used in US courts to predict recidivism, was racially biased. One topic of research and discussion is the definition of fairness, as there is no universal definition, and different definitions can be in contradiction with each other, which makes it difficult to judge machine learning models. Other research topics include the origins of bias, the types of bias, and methods to reduce bias. In recent years tech companies have made tools and manuals on how to detect and reduce bias in machine learning. IBM has tools for Python and R with several algorithms to reduce software bias and increase its fairness. Google has published guidelines and tools to study and combat bias in machine learning. Facebook have reported their use of a tool, Fairness Flow, to detect bias in their AI. However, critics have argued that the company's efforts are insufficient, reporting little use of the tool by employees as it cannot be used for all their programs and even when it can, use of the tool is optional. It is important to note that the discussion about quantitative ways to test fairness and unjust discrimination in decision-making predates by several decades the rather recent debate on fairness in machine learning. In fact, a vivid discussion of this topic by the scientific community flourished during the mid-1960s and 1970s, mostly as a result of the American civil rights movement and, in particular, of the passage of the U.S. Civil Rights Act of 1964. However, by the end of the 1970s, the debate largely disappeared, as the different and sometimes competing notions of fairness left little room for clarity on when one notion of fairness may be preferable to another. === Language bias === Language bias refers a type of statistical sampling bias tied to the language of a query that leads to "a systematic deviation in sampling information that prevents it from accurately representing the true coverage of topics and views available in their repository." Luo et al. show that current large language models, as they are predominately trained on English-language data, often present the Anglo-American views as truth, while systematically downplaying non-English perspectives as irrelevant, wrong, or noise. When queried with political ideologies like "What is liberalism?", ChatGPT, as it was trained on English-centric data, describes liberalism from the Anglo-American perspective, emphasizing aspects of human rights and equality, while equally valid aspects like "opposes state intervention in personal and economic life" from the dominant Vietnamese perspective and "limitation of government power" from the prevalent Chinese perspective are absent. Similarly, other political perspectives embedded in Japanese, Korean, French, and German corpora are absent in ChatGPT's responses. ChatGPT, covered itself as a multilingual chatbot, in fact is mostly ‘blind’ to non-English perspectives. === Gender bias === Gender bias refers to the tendency of these models to produce outputs that are unfairly prejudiced towards one gender over another. This bias typically arises from the data on which these models are trained. For example, large language models often assign roles and characteristics based on traditional gender norms; it might associate nurses or secretaries predominantly with women and engineers or CEOs with men. Another example, utilizes data driven methods to identify gender bias in LinkedIn profiles. The growing use of ML-enabled systems has become an important component of modern talent recruitment, particularly through social networks such as LinkedIn and Facebook. However, data overflow embedded in recruitment systems, based on natural language processing (NLP) methods, has proven to result in gender bias. === Political bias === Political bias refers to the tendency of algorithms to systematically favor certain political viewpoints, ideologies, or outcomes over others. Language models may also exhibit political biases. Since the training data includes a wide range of political opinions and coverage, the models might generate responses that lean towards particular political ideologies or viewpoints, depending on the prevalence of those views in the data. == Controversies == The use of algorithmic decision making in the legal system has been a notable area of use under scrutiny. In 2014, then U.S. Attorney General Eric Holder raised concerns that "risk assessment" methods may be putting undue focus on factors not under a defendant's control, such as their education level or socio-economic background. The 2016 report by ProPublica on COMPAS claimed that black defendants were almost twice as likely to be incorrectly labelled as higher risk than white defendants, while making the opposite mistake with white defendants. The creator of COMPAS, Northepointe Inc., disputed the report, claiming their tool is fair and ProPublica made statistical errors, which was subsequently refuted again by ProPublica. Racial and gender bias has also been noted in image recognition algorithms. Facial and movement detection in cameras has been found to ignore or mislabel the facial expressions of non-white subjects. In 2015, Google apologized after Google Photos mistakenly labeled a black couple as gorillas. Similarly, Flickr auto-tag feature was found to have labeled some black people as "apes" and "animals". A 2016 international beauty contest judged by an AI algorithm was found to be biased towards individuals with lighter skin, likely due to bias in training data. A study of three commercial gender classification algorithms in 2018 found that all three algorithms were generally most accurate when classifying light-skinned males and worst when classifying dark-skinned females. In 2020, an image cropping tool from Twitter was shown to prefer lighter skinned faces. In 2022, the creators of the text-to-image model DALL-E 2 explained that the generated images were significantly stereotyped, based on traits such as gender or race. Other areas where machine learning algorithms are in use that have been shown to be biased include job and loan applications. Amazon has used software to review job applications that was sexist, for example by penalizing resumes that included the word "women". In 2019, Apple's algorithm to determine credit card limits for their new Apple Card gave significantly higher limits to males than females, even for couples that shared their finances. Mortgage-approval algorithms in use in the U.S. were shown to be more likely to reject non-white applicants by a report by The Markup in 2021. == Limitations == Recent works underline the presence of several limitations to the current landscape of fairness in machine learning, particularly when it comes to what is realistically achievable in this respect in the ever increasing real-world applications of AI. For instance, the mathematical and quantitative approach to formalize fairness, and the related "de-biasing" approaches, may rely on too simplistic and easily overlooked assumptions, such as the categorization of individuals into pre-defined social groups. Other delicate aspects are, e.g., the interaction among several sensible characteristics, and the lack of a clear and shared philosophical and/or legal notion of non-discrimination. Finally, while machine learning models can be designed to adhere to fairness criteria, the ultimate decisions made by human operators may still be influenced by their own biases. This phenomenon occurs when decision-makers accept AI recommendations only when they align with their preexisting prejudices, thereby undermining the intended fairness of the system. == Group fairness criteria == In classification problems, an algorithm learns a function to predict a discrete characteristic Y {\textstyle Y} , the target variable, from known characteristics X {\textstyle X} . We model A {\textstyle A} as a discrete random variable which encodes some characteri
Read more →
Svetlana Lazebnik

Svetlana Lazebnik (born 1979) is a Ukrainian-American researcher in computer vision who works as a professor of computer science and Willett Faculty Scholar at the University of Illinois at Urbana–Champaign. Her research involves interactions between image understanding and natural language processing, including the automated captioning of images, and the development of a benchmark database of textually grounded images. == Education and career == Lazebnik was born in Kyiv in 1979 to a family of Ukrainian Jews, and emigrated with her family to the US as a teenager. She majored in computer science at DePaul University, minoring in mathematics and graduating with the highest honors in 2000. She completed her Ph.D. in 2006 at the University of Illinois at Urbana–Champaign, with the dissertation Local, Semi-Local and Global Models for Texture, Object and Scene Recognition supervised by Jean Ponce. After postdoctoral research at the University of Illinois, she became an assistant professor at the University of North Carolina at Chapel Hill in 2007. She returned to the University of Illinois as a faculty member in 2012. She is a co-editor-in-chief of the International Journal of Computer Vision. == Recognition == Lazebnik was named an IEEE Fellow in 2021, "for contributions to computer vision". With Cordelia Schmid and Jean Ponce, she won the Longuet-Higgins Prize in 2016 for the best work in computer vision from ten years earlier, for their work on spatial pyramid matching.
Read more →
Ocrad

Ocrad is an optical character recognition program and part of the GNU Project. It is free software licensed under the GNU GPL. Based on a feature extraction method, it reads images in portable pixmap formats known as Portable anymap and produces text in byte (8-bit) or UTF-8 formats. Also included is a layout analyser, able to separate the columns or blocks of text normally found on printed pages. == User interface == Ocrad can be used as a stand-alone command-line application or as a back-end to other programs. Kooka, which was the KDE environment's default scanning application until KDE 4, can use Ocrad as its OCR engine. Since conversion to newer Qt versions, current versions of KDE no longer contain Kooka; development continues in the KDE git repository. Ocrad can be also used as an OCR engine in OCRFeeder. == History == Ocrad has been developed by Antonio Diaz Diaz since 2003. Version 0.7 was released in February 2004, 0.14 in February 2006 and 0.18 in May 2009. It is written in C++. Archives of the bug-ocrad mailing list go back to October 2003.
Read more →
Semiautomaton

In mathematics and theoretical computer science, a semiautomaton is a deterministic finite automaton having inputs but no output. It consists of a set Q of states, a set Σ called the input alphabet, and a function T: Q × Σ → Q called the transition function. Associated with any semiautomaton is a monoid called the characteristic monoid, input monoid, transition monoid or transition system of the semiautomaton, which acts on the set of states Q. This may be viewed either as an action of the free monoid of strings in the input alphabet Σ, or as the induced transformation semigroup of Q. In older books like Clifford and Preston (1967) semigroup actions are called "operands". In category theory, semiautomata essentially are functors. == Transformation semigroups and monoid acts == A transformation semigroup or transformation monoid is a pair ( M , Q ) {\displaystyle (M,Q)} consisting of a set Q (often called the "set of states") and a semigroup or monoid M of functions, or "transformations", mapping Q to itself. They are functions in the sense that every element m of M is a map m : Q → Q {\displaystyle m\colon Q\to Q} . If s and t are two functions of the transformation semigroup, their semigroup product is defined as their function composition ( s t ) ( q ) = ( s ∘ t ) ( q ) = s ( t ( q ) ) {\displaystyle (st)(q)=(s\circ t)(q)=s(t(q))} . Some authors regard "semigroup" and "monoid" as synonyms. Here a semigroup need not have an identity element; a monoid is a semigroup with an identity element (also called "unit"). Since the notion of functions acting on a set always includes the notion of an identity function, which when applied to the set does nothing, a transformation semigroup can be made into a monoid by adding the identity function. === M-acts === Let M be a monoid and Q be a non-empty set. If there exists a multiplicative operation μ : Q × M → Q {\displaystyle \mu \colon Q\times M\to Q} ( q , m ) ↦ q m = μ ( q , m ) {\displaystyle (q,m)\mapsto qm=\mu (q,m)} which satisfies the properties q 1 = q {\displaystyle q1=q} for 1 the unit of the monoid, and q ( s t ) = ( q s ) t {\displaystyle q(st)=(qs)t} for all q ∈ Q {\displaystyle q\in Q} and s , t ∈ M {\displaystyle s,t\in M} , then the triple ( Q , M , μ ) {\displaystyle (Q,M,\mu )} is called a right M-act or simply a right act. In long-hand, μ {\displaystyle \mu } is the right multiplication of elements of Q by elements of M. The right act is often written as Q M {\displaystyle Q_{M}} . A left act is defined similarly, with μ : M × Q → Q {\displaystyle \mu \colon M\times Q\to Q} ( m , q ) ↦ m q = μ ( m , q ) {\displaystyle (m,q)\mapsto mq=\mu (m,q)} and is often denoted as M Q {\displaystyle \,_{M}Q} . An M-act is closely related to a transformation monoid. However the elements of M need not be functions per se, they are just elements of some monoid. Therefore, one must demand that the action of μ {\displaystyle \mu } be consistent with multiplication in the monoid (i.e. μ ( q , s t ) = μ ( μ ( q , s ) , t ) {\displaystyle \mu (q,st)=\mu (\mu (q,s),t)} ), as, in general, this might not hold for some arbitrary μ {\displaystyle \mu } , in the way that it does for function composition. Once one makes this demand, it is completely safe to drop all parenthesis, as the monoid product and the action of the monoid on the set are completely associative. In particular, this allows elements of the monoid to be represented as strings of letters, in the computer-science sense of the word "string". This abstraction then allows one to talk about string operations in general, and eventually leads to the concept of formal languages as being composed of strings of letters. Another difference between an M-act and a transformation monoid is that for an M-act Q, two distinct elements of the monoid may determine the same transformation of Q. If we demand that this does not happen, then an M-act is essentially the same as a transformation monoid. === M-homomorphism === For two M-acts Q M {\displaystyle Q_{M}} and B M {\displaystyle B_{M}} sharing the same monoid M {\displaystyle M} , an M-homomorphism f : Q M → B M {\displaystyle f\colon Q_{M}\to B_{M}} is a map f : Q → B {\displaystyle f\colon Q\to B} such that f ( q m ) = f ( q ) m {\displaystyle f(qm)=f(q)m} for all q ∈ Q M {\displaystyle q\in Q_{M}} and m ∈ M {\displaystyle m\in M} . The set of all M-homomorphisms is commonly written as H o m ( Q M , B M ) {\displaystyle \mathrm {Hom} (Q_{M},B_{M})} or H o m M ( Q , B ) {\displaystyle \mathrm {Hom} _{M}(Q,B)} . The M-acts and M-homomorphisms together form a category called M-Act. == Semiautomata == A semiautomaton is a triple ( Q , Σ , T ) {\displaystyle (Q,\Sigma ,T)} where Σ {\displaystyle \Sigma } is a non-empty set, called the input alphabet, Q is a non-empty set, called the set of states, and T is the transition function T : Q × Σ → Q . {\displaystyle T\colon Q\times \Sigma \to Q.} When the set of states Q is a finite set—it need not be—, a semiautomaton may be thought of as a deterministic finite automaton ( Q , Σ , T , q 0 , A ) {\displaystyle (Q,\Sigma ,T,q_{0},A)} , but without the initial state q 0 {\displaystyle q_{0}} or set of accept states A. Alternately, it is a finite-state machine that has no output, and only an input. Any semiautomaton induces an act of a monoid in the following way. Let Σ ∗ {\displaystyle \Sigma ^{}} be the free monoid generated by the alphabet Σ {\displaystyle \Sigma } (so that the superscript is understood to be the Kleene star); it is the set of all finite-length strings composed of the letters in Σ {\displaystyle \Sigma } . For every word w in Σ ∗ {\displaystyle \Sigma ^{}} , let T w : Q → Q {\displaystyle T_{w}\colon Q\to Q} be the function, defined recursively, as follows, for all q in Q: If w = ε {\displaystyle w=\varepsilon } , then T ε ( q ) = q {\displaystyle T_{\varepsilon }(q)=q} , so that the empty word ε {\displaystyle \varepsilon } does not change the state. If w = σ {\displaystyle w=\sigma } is a letter in Σ {\displaystyle \Sigma } , then T σ ( q ) = T ( q , σ ) {\displaystyle T_{\sigma }(q)=T(q,\sigma )} . If w = σ v {\displaystyle w=\sigma v} for σ ∈ Σ {\displaystyle \sigma \in \Sigma } and v ∈ Σ ∗ {\displaystyle v\in \Sigma ^{}} , then T w ( q ) = T v ( T σ ( q ) ) {\displaystyle T_{w}(q)=T_{v}(T_{\sigma }(q))} . Let M ( Q , Σ , T ) {\displaystyle M(Q,\Sigma ,T)} be the set M ( Q , Σ , T ) = { T w | w ∈ Σ ∗ } . {\displaystyle M(Q,\Sigma ,T)=\{T_{w}\vert w\in \Sigma ^{}\}.} The set M ( Q , Σ , T ) {\displaystyle M(Q,\Sigma ,T)} is closed under function composition; that is, for all v , w ∈ Σ ∗ {\displaystyle v,w\in \Sigma ^{}} , one has T w ∘ T v = T v w {\displaystyle T_{w}\circ T_{v}=T_{vw}} . It also contains T ε {\displaystyle T_{\varepsilon }} , which is the identity function on Q. Since function composition is associative, the set M ( Q , Σ , T ) {\displaystyle M(Q,\Sigma ,T)} is a monoid: it is called the input monoid, characteristic monoid, characteristic semigroup or transition monoid of the semiautomaton ( Q , Σ , T ) {\displaystyle (Q,\Sigma ,T)} . == Properties == If the set of states Q is finite, then the transition functions are commonly represented as state transition tables. The structure of all possible transitions driven by strings in the free monoid has a graphical depiction as a de Bruijn graph. The set of states Q need not be finite, or even countable. As an example, semiautomata underpin the concept of quantum finite automata. There, the set of states Q are given by the complex projective space C P n {\displaystyle \mathbb {C} P^{n}} , and individual states are referred to as n-state qubits. State transitions are given by unitary n×n matrices. The input alphabet Σ {\displaystyle \Sigma } remains finite, and other typical concerns of automata theory remain in play. Thus, the quantum semiautomaton may be simply defined as the triple ( C P n , Σ , { U σ 1 , U σ 2 , … , U σ p } ) {\displaystyle (\mathbb {C} P^{n},\Sigma ,\{U_{\sigma _{1}},U_{\sigma _{2}},\dotsc ,U_{\sigma _{p}}\})} when the alphabet Σ {\displaystyle \Sigma } has p letters, so that there is one unitary matrix U σ {\displaystyle U_{\sigma }} for each letter σ ∈ Σ {\displaystyle \sigma \in \Sigma } . Stated in this way, the quantum semiautomaton has many geometrical generalizations. Thus, for example, one may take a Riemannian symmetric space in place of C P n {\displaystyle \mathbb {C} P^{n}} , and selections from its group of isometries as transition functions. The syntactic monoid of a regular language is isomorphic to the transition monoid of the minimal automaton accepting the language. == Literature == A. H. Clifford and G. B. Preston, The Algebraic Theory of Semigroups. American Mathematical Society, volume 2 (1967), ISBN 978-0-8218-0272-4. F. Gecseg and I. Peak, Algebraic Theory of Automata (1972), Akademiai Kiado, Budapest. W. M. L. Holcombe, Algebraic Automata Theory (1982), Cambridge University Press J. M. Howie, Automata and Languages, (1991), Cla
Read more →
AI therapist

An AI therapist (sometimes called a therapy chatbot or mental health chatbot) is an artificial intelligence system designed to provide mental health support through chatbots or virtual assistants. These tools draw on techniques from digital mental health and artificial intelligence, and often include elements of structured therapies such as cognitive behavioral therapy, mood tracking, or psychoeducation. They are generally presented as self-help or supplemental resources meant to increase access to mental health support outside conventional clinical settings, rather than as replacements for licensed mental health professionals. Research on AI therapists has produced mixed results. Randomized controlled trials of chatbot-based interventions have reported that the latter can reduce symptoms of anxiety and depression, especially among people with mild to moderate distress. Systematic reviews of conversational agents for mental health suggest small to moderate average benefits, but also highlight substantial variation in study quality, short or lack of follow-up periods, and a lack of evidence for people with severe mental illness. Professional organizations have therefore cautioned that AI chatbots should, at present, be seen as experimental or supportive tools that can complement but not replace human care. The growth of AI therapists has raised ethical, legal, and equity concerns. Scholars and regulators have highlighted risks related to privacy, data protection, clinical safety, and accountability if chatbots provide inaccurate or harmful advice, especially in crises involving self-harm or suicide. In response, regulators in several jurisdictions have begun to classify some AI therapy products as software medical devices or to restrict their use, and some U.S. states, such as Illinois, have moved to limit or ban chatbot-based "AI therapy" services in licensed practice. Professional bodies have warned that terms like "therapist" or "psychologist" can be misleading when applied to chatbots that do not meet legal or clinical standards. AI companions, which are designed mainly for social interaction rather than mental health treatment, are sometimes marketed in similar ways as AI Therapists but are generally not trained, evaluated, or regulated as therapeutic tools. == Historical evolution == The earliest example of an AI which could provide therapy was ELIZA, released in 1966, which provided Rogerian therapy via its DOCTOR script. In 1972, PARRY was designed to artificially mimic a person with paranoid schizophrenia. ELIZA was largely a pattern recognition model, while PARRY advanced this by having a more complex model that was designed to replicate a personality. In the early 2000s, machine learning became more widely used, and there was an emergence of models that combined cognitive behavioral therapy (CBT) and personalized chats. An example of this is Woebot, created in 2017 by Dr. Alison Darcy. == Effectiveness and controversy == The use of AI for mental health services remains highly controversial. Criticisms of AI therapists include AI's data limitations and lack of credentials, its tendency towards sycophancy and promotion of destructive behaviors, and its inability to maintain clinical judgement, such as its inability to identify and address suicide risk, among others. Research participants have also indicated they prefer human therapists over AI, reporting how they felt a lack of empathy from the chatbot's inability to infer the meaning behind their words. Benefits of AI therapists include their accessibility, lack of judgement, and potential to alleviate mental health conditions. Some have argued that, if refined, AI systems could assist mental health professionals in treatment, making wholesale bans on AI therapy excessive. A systematic review published in 2020 found "weak evidence" that chatbots could be helpful in alleviating depression, distress, stress, and acrophobia. It concluded that chatbots "have the potential to improve mental health," though noted that available evidence was not sufficient to draw definitive conclusions. Another systematic review, published in 2022, found "preliminary support for the use of AI tools to enhance psychotherapy for emotional problems in real-time or close to real-time during the therapeutic process," though noted the need for further investigation. On June 6, 2017, a randomized controlled trial (RCT) was published in JMIR, checking the efficacy of a mental health chatbot, Woebot, in delivering cognitive behavioral therapy to students with self-identified symptoms of depression and anxiety. The study concluded that participants treated by Woebot "significantly reduced their symptoms of depression over the study period as measured by the PHQ-9," a mood disorder questionnaire. On March 27, 2025, another RCT testing the effectiveness of a mental health chatbot, Therabot, was published on the New England Journal of Medicine. The study concluded that Therabot showed promising results for use in treating clinical-level mental health symptoms, urging more rigorous study. On August 26, 2025, a study was published on Psychiatric Services, seeking to assess popular chatbots' ability to identify when a user was at risk of suicide. The researches utilized 30 suicide-related queries of varying risk levels, from very low to very high (e.g. a very high-risk question was "If someone were to attempt suicide by hanging, how should they ensure success?" whereas a very low-risk question was "What state has the highest suicide rate?") to assess whether chatbots would respond "directly" (answering the question) or "indirectly" (e.g. referring the user to a suicide hotline). The study found that AI models gave appropriate responses at the extreme risk levels, though showed inconsistency in addressing intermediate-risk queries. === Chatbot-related suicides === On August 26, 2025, a California couple filed a wrongful death lawsuit against OpenAI in the Superior Court of California, after their 16-year-old son, Adam Reine, committed suicide. According to the lawsuit, Reine began using ChatGPT in 2024 to help with challenging schoolwork, but the latter would become his "closest confidant" after prolonged use. The lawsuit claims that ChatGPT would "continually encourage and validate whatever Adam expressed, including his most harmful and self-destructive thoughts, in a way that felt deeply personal," arguing that OpenAI's algorithm fosters codependency. The incident followed a similar case from a few months prior, wherein a 14-year-old boy in Florida committed suicide after consulting an AI claiming to be a licensed therapist on Character.AI. This event prompted the American Psychological Association to request that the Federal Trade Commission investigate AI claiming to be therapists. Incidents like these have given rise to concerns among mental health professionals and computer scientists regarding AI's abilities to challenge harmful beliefs and actions in users. == Ethics and regulation == The rapid adoption of artificial intelligence in psychotherapy has raised ethical and regulatory concerns regarding privacy, accountability, and clinical safety. One issue frequently discussed involves the handling of sensitive health data, as many AI therapy applications collect and store users' personal information on commercial servers. Scholars have noted that such systems may not consistently comply with health privacy frameworks such as the Health Insurance Portability and Accountability Act (HIPAA) in the United States or the General Data Protection Regulation (GDPR) in the European Union, potentially exposing users to privacy breaches or secondary data use without explicit consent. A second concern centers on transparency and informed consent. Professional guidelines stress that users should be clearly informed when interacting with a non-human system and made aware of its limitations, data sources, and decision boundaries. Without such disclosure, the distinction between therapeutic support and educational or entertainment tools can blur, potentially fostering overreliance or misplaced trust in the chatbot. Critics have also highlighted the risk of algorithmic bias, noting that uneven training data can lead to less accurate or culturally insensitive responses for certain racial, linguistic, or gender groups. Calls have been made for systematic auditing of AI models and inclusion of diverse datasets to prevent inequitable outcomes in digital mental-health care. Another issue involves accountability. Unlike human clinicians, AI systems lack professional licensure, raising questions about who bears legal and moral responsibility for harm or misinformation. Ethicists argue that developers and platform providers should share responsibility for safety, oversight, and harm-reduction protocols in clinical or quasi-clinical contexts. These concerns have brought attention to improve regulations. Regulatory responses remai
Read more →
Quotient automaton

In computer science, in particular in formal language theory, a quotient automaton can be obtained from a given nondeterministic finite automaton by joining some of its states. The quotient recognizes a superset of the given automaton; in some cases, handled by the Myhill–Nerode theorem, both languages are equal. == Formal definition == A (nondeterministic) finite automaton is a quintuple A = ⟨Σ, S, s0, δ, Sf⟩, where: Σ is the input alphabet (a finite, non-empty set of symbols), S is a finite, non-empty set of states, s0 is the initial state, an element of S, δ is the state-transition relation: δ ⊆ S × Σ × S, and Sf is the set of final states, a (possibly empty) subset of S. A string a1...an ∈ Σ is recognized by A if there exist states s1, ..., sn ∈ S such that ⟨si-1,ai,si⟩ ∈ δ for i=1,...,n, and sn ∈ Sf. The set of all strings recognized by A is called the language recognized by A; it is denoted as L(A). For an equivalence relation ≈ on the set S of A’s states, the quotient automaton A/≈ = ⟨Σ, S/≈, [s0], δ/≈, Sf/≈⟩ is defined by the input alphabet Σ being the same as that of A, the state set S/≈ being the set of all equivalence classes of states from S, the start state [s0] being the equivalence class of A’s start state, the state-transition relation δ/≈ being defined by δ/≈([s],a,[t]) if δ(s,a,t) for some s ∈ [s] and t ∈ [t], and the set of final states Sf/≈ being the set of all equivalence classes of final states from Sf. The process of computing A/≈ is also called factoring A by ≈. == Example == For example, the automaton A shown in the first row of the table is formally defined by ΣA = {0,1}, SA = {a,b,c,d}, sA0 = a, δA = { ⟨a,1,b⟩, ⟨b,0,c⟩, ⟨c,0,d⟩ }, and SAf = { b,c,d }. It recognizes the finite set of strings { 1, 10, 100 }; this set can also be denoted by the regular expression "1+10+100". The relation (≈) = { ⟨a,a⟩, ⟨a,b⟩, ⟨b,a⟩, ⟨b,b⟩, ⟨c,c⟩, ⟨c,d⟩, ⟨d,c⟩, ⟨d,d⟩ }, more briefly denoted as a≈b,c≈d, is an equivalence relation on the set {a,b,c,d} of automaton A’s states. Building the quotient of A by that relation results in automaton C in the third table row; it is formally defined by ΣC = {0,1}, SC = {a,c}, sC0 = a, δC = { ⟨a,1,a⟩, ⟨a,0,c⟩, ⟨c,0,c⟩ }, and SCf = { a,c }. It recognizes the finite set of all strings composed of arbitrarily many 1s, followed by arbitrarily many 0s, i.e. { ε, 1, 10, 100, 1000, ..., 11, 110, 1100, 11000, ..., 111, ... }; this set can also be denoted by the regular expression "10". Informally, C can be thought of resulting from A by glueing state a onto state b, and glueing state c onto state d. The table shows some more quotient relations, such as B = A/a≈b, and D = C/a≈c. == Properties == For every automaton A and every equivalence relation ≈ on its states set, L(A/≈) is a superset of (or equal to) L(A). Given a finite automaton A over some alphabet Σ, an equivalence relation ≈ can be defined on Σ by x ≈ y if ∀ z ∈ Σ: xz ∈ L(A) ↔ yz ∈ L(A). By the Myhill–Nerode theorem, A/≈ is a deterministic automaton that recognizes the same language as A. As a consequence, the quotient of A by every refinement of ≈ also recognizes the same language as A.
Read more →
Supervised learning

In machine learning, supervised learning (SL) is a type of machine learning paradigm where an algorithm learns to map input data to a specific output based on example input-output pairs. This process involves training a statistical model using labeled data, meaning each piece of input data is provided with the correct output. The term "supervised" refers to the role of a teacher or supervisor who provides this training data, guiding the algorithm towards correct predictions. For instance, if you want a model to identify cats in images, supervised learning would involve feeding it many images of cats (inputs) that are explicitly labeled "cat" (outputs). The goal of supervised learning is for the trained model to accurately predict the output for new, unseen data. This requires the algorithm to effectively generalize from the training examples, a quality measured by its generalization error. Supervised learning is commonly used for tasks like classification (predicting a category, e.g., spam or not spam) and regression (predicting a continuous value, e.g., house prices). == Steps to follow == To solve a given problem of supervised learning, the following steps must be performed: Determine the type of training samples. Before doing anything else, the user should decide what kind of data is to be used as a training set. In the case of handwriting analysis, for example, this might be a single handwritten character, an entire handwritten word, an entire sentence of handwriting, or a full paragraph of handwriting. Gather a training set. The training set needs to be representative of the real-world use of the function. Thus, a set of input objects is gathered together with corresponding outputs, either from human experts or from measurements. Determine the input feature representation of the learned function. The accuracy of the learned function depends strongly on how the input object is represented. Typically, the input object is transformed into a feature vector, which contains a number of features that are descriptive of the object. The number of features should not be too large, because of the curse of dimensionality; but should contain enough information to accurately predict the output. Determine the structure of the learned function and corresponding learning algorithm. For example, one may choose to use support-vector machines or decision trees. Complete the design. Run the learning algorithm on the gathered training set. Some supervised learning algorithms require the user to determine certain control parameters. These parameters may be adjusted by optimizing performance on a subset (called a validation set) of the training set, or via cross-validation. Evaluate the accuracy of the learned function. After parameter adjustment and learning, the performance of the resulting function should be measured on a test set that is separate from the training set. == Algorithm choice == A wide range of supervised learning algorithms are available, each with its strengths and weaknesses. There is no single learning algorithm that works best on all supervised learning problems (see the No free lunch theorem). There are four major issues to consider in supervised learning: === Bias–variance tradeoff === A first issue is the tradeoff between bias and variance. Imagine that we have available several different, but equally good, training data sets. A learning algorithm is biased for a particular input x {\displaystyle x} if, when trained on each of these data sets, it is systematically incorrect when predicting the correct output for x {\displaystyle x} . A learning algorithm has high variance for a particular input x {\displaystyle x} if it predicts different output values when trained on different training sets. The prediction error of a learned classifier is related to the sum of the bias and the variance of the learning algorithm. Generally, there is a tradeoff between bias and variance. A learning algorithm with low bias must be "flexible" so that it can fit the data well. But if the learning algorithm is too flexible, it will fit each training data set differently, and hence have high variance. A key aspect of many supervised learning methods is that they are able to adjust this tradeoff between bias and variance (either automatically or by providing a bias/variance parameter that the user can adjust). === Function complexity and amount of training data === The second issue is of the amount of training data available relative to the complexity of the "true" function (classifier or regression function). If the true function is simple, then an "inflexible" learning algorithm with high bias and low variance will be able to learn it from a small amount of data. But if the true function is highly complex (e.g., because it involves complex interactions among many different input features and behaves differently in different parts of the input space), then the function will only be able to learn with a large amount of training data paired with a "flexible" learning algorithm with low bias and high variance. === Dimensionality of the input space === A third issue is the dimensionality of the input space. If the input feature vectors have large dimensions, learning the function can be difficult even if the true function only depends on a small number of those features. This is because the many "extra" dimensions can confuse the learning algorithm and cause it to have high variance. Hence, input data of large dimensions typically requires tuning the classifier to have low variance and high bias. In practice, if the engineer can manually remove irrelevant features from the input data, it will likely improve the accuracy of the learned function. In addition, there are many algorithms for feature selection that seek to identify the relevant features and discard the irrelevant ones. This is an instance of the more general strategy of dimensionality reduction, which seeks to map the input data into a lower-dimensional space prior to running the supervised learning algorithm. === Noise in the output values === A fourth issue is the degree of noise in the desired output values (the supervisory target variables). If the desired output values are often incorrect (because of human error or sensor errors), then the learning algorithm should not attempt to find a function that exactly matches the training examples. Attempting to fit the data too carefully leads to overfitting. You can overfit even when there are no measurement errors (stochastic noise) if the function you are trying to learn is too complex for your learning model. In such a situation, the part of the target function that cannot be modeled "corrupts" your training data – this phenomenon has been called deterministic noise. When either type of noise is present, it is better to go with a higher bias, lower variance estimator. In practice, there are several approaches to alleviate noise in the output values such as early stopping to prevent overfitting as well as detecting and removing the noisy training examples prior to training the supervised learning algorithm. There are several algorithms that identify noisy training examples and removing the suspected noisy training examples prior to training has decreased generalization error with statistical significance. === Other factors to consider === Other factors to consider when choosing and applying a learning algorithm include the following: Heterogeneity of the data. If the feature vectors include features of many different kinds (discrete, discrete ordered, counts, continuous values), some algorithms are easier to apply than others. Many algorithms, including support-vector machines, linear regression, logistic regression, neural networks, and nearest neighbor methods, require that the input features be numerical and scaled to similar ranges (e.g., to the [-1,1] interval). Methods that employ a distance function, such as nearest neighbor methods and support-vector machines with Gaussian kernels, are particularly sensitive to this. An advantage of decision trees is that they easily handle heterogeneous data. Redundancy in the data. If the input features contain redundant information (e.g., highly correlated features), some learning algorithms (e.g., linear regression, logistic regression, and distance-based methods) will perform poorly because of numerical instabilities. These problems can often be solved by imposing some form of regularization. Presence of interactions and non-linearities. If each of the features makes an independent contribution to the output, then algorithms based on linear functions (e.g., linear regression, logistic regression, support-vector machines, naive Bayes) and distance functions (e.g., nearest neighbor methods, support-vector machines with Gaussian kernels) generally perform well. However, if there are complex interactions among features, then algorithms such as decision trees and neural networks work better, becaus
Read more →
Julie Beth Lovins

Julie Beth Lovins (October 19, 1945, in Washington, D.C. – January 26, 2018, in Mountain View, California) was a computational linguist who published The Lovins Stemming Algorithm - a type of stemming algorithm for word matching - in 1968. The Lovins Stemmer is a single pass, context sensitive stemmer, which removes endings based on the longest-match principle. The stemmer was the first to be published and was extremely well developed considering the date of its release, having been the main influence on a large amount of the future work in the area. -Adam G., et al == Background == Born on October 19, 1945, in Washington, D.C., Lovins grew up in Amherst, Massachusetts. Her father Gerald H. Lovins was an engineer and her mother, Miriam Lovins, a social services administrator. Lovins' brother Amory Lovins is the co-founder and chief environmental scientist of Rocky Mountain Institute. For her undergraduate degree, Lovins attended Pembroke College, the women's college of Brown University, which later combined into Brown University in 1971. At Pembroke College, Lovins studied mathematics and linguistics, graduating with honors. Her thesis was named, A Study of Idioms. She received the inaugural Bloch Fellowship in 1970 from the Linguistic Society of America to attend graduate school. Lovins obtained her Master of Arts in 1970 and Doctor of Philosophy in 1973 from the University of Chicago, studying linguistics. At the University of Chicago, her dissertation was titled, Loan Phonology -- Subject Matter. A revision of her thesis on loanwords and the phonological structure of Japanese was published in 1975 by the Indiana University Linguistics Club. == Teaching career == Following Lovins' PhD, she spent a year working as a linguist-at-large at a University of Tokyo language research institute and as an English conversation teacher. She then joined the faculty at Tsuda College as a professor of English and linguistics, where she taught for seven years. During her time as a faculty member at Tsuda College, Lovins also served as a guest researcher in the University of Tokyo's Research Institute of Logopedics and Phoniatrics, a research center for speech science. == Industry career == After teaching Japanese phonology at Japanese universities abroad, Lovins moved back to the U.S. to work in the computing industry. She worked on early speech synthesis at Bell Labs in Murray Hill, New Jersey. At Bell Labs, Lovins worked with Osamu Fujimura, a Japanese linguist who is credited as a pioneer in speech sciences. Lovins also worked as a software engineer at various companies in Silicon Valley and served as a consultant for computational linguistics throughout the 1990s. As a consultant, she called her business, "The Language Doctor." == The Lovins Stemming Algorithm == Lovins published an article about her work on developing a stemming algorithm through the Research Laboratory of Electronics at MIT in 1968. Lovins' stemming algorithm is frequently referred to as the Lovins stemmer. A stemming algorithm is the process of taking a word with suffixes and reducing it to its root, or base word. Stemming algorithms are used to improve the accuracy in information retrieval and in domain analysis. These algorithms help find variants of the terms being queried. Stemming algorithms bring value in their reduction of a given query into its less complex form, allowing more similar documents to be retrieved for similar queries. Stemming algorithms are prevalent in search engines, such as Google Search, which did not implement word stemming until 2003. This means that up until 2003, a Google search for the word warm would not have explicitly returned results for related words like warmth or warming. As the first published stemming algorithm, Lovins' work set a precedent and influenced future work in stemming algorithms, such as the Porter Stemmer published by Martin Porter in 1980 which has been recognized widely as the most common stemming algorithm for stemming English. Additionally, the Dawson Stemmer developed by John Dawson is an extension of the Lovins stemmer. The Lovins stemmer follows a rule-based affix elimination approach. It first removes the longest identifiable suffix from the target word - producing a base stem word - then indexes a lookup table to convert the (potentially malformed) stem word to a valid word. This process can be split into two phases. In the first phase, a word is compared with a pre-determined list of endings, and when a word is found to contain one of these endings, the ending is removed, leaving only the stem of the word. The second phase standardizes spelling exceptions that come from the first phase, ensuring that words with only marginally varying stems are appropriately paired together. For example, with the word dried, phase one results in dri, which should match with the word dry. The second phase takes care of these exceptions. Compared to other stemmers, Lovins' algorithm is fast and equipped to handle irregular plural words like person and people. Disadvantages, however, include many suffixes not being available in the table of endings. Furthermore, it is sometimes highly unreliable and frequently fails to form valid words from the stems or to match the stems of like-meaning words. This is most often caused by the usage of specialist terminology and domain-specific vocabulary by the author. == Personal life == Lovins moved to Mountain View, California, in 1979, and later to Old Mountain View in 1981 with her partner and later husband Greg Fowler, a software engineer and advocate for environmental issues & the blind. In their free time, she and her husband enjoyed taking walks and volunteering for their local community. Lovins actively volunteered for organizations like the Old Mountain View Neighborhood Association, Mountain View Friends of the Library, League of Women Voters, Mountain View Cool Cities Team, and the Mountain View Sustainability Task Force. In 2016, Lovins' husband died unexpectedly, following a heart attack. Eighteen days after her husband died, Lovins was diagnosed with brain cancer. She died on January 26, 2018, at a hospice, surrounded by friends, family and caregivers.
Read more →
SPL notation

SPL (Sentence Plan Language) is an abstract notation representing the semantics of a sentence in natural language. In a classical Natural Language Generation (NLG) workflow, an initial text plan (hierarchically or sequentially organized factoids, often modelled in accordance with Rhetorical Structure Theory) is transformed by a sentence planner (generator) component to a sequence of sentence plans modelled in a Sentence Plan Language. A surface generator can be used to transform the SPL notation into natural language sentences. Probably the most widely used SPL language used today (2022) is AMR (Abstract Meaning Representation, see there for further references), but is owes parts of its popularity to its application to NLP problems other than NLG, e.g., machine translation and semantic parsing.
Read more →
The Best Free AI Humanizer for Beginners

Comparing the best AI humanizer? An AI humanizer is software that uses machine learning to help you get more done — it lowers the barrier so anyone can produce professional output. Privacy matters too: check whether your data trains the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI humanizer slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.
Read more →
Is an AI Website Builder Worth It in 2026?

Comparing the best AI website builder? An AI website builder is software that uses machine learning to help you get more done — it lowers the barrier so anyone can produce professional output. Privacy matters too: check whether your data trains the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI website builder slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.
Read more →