Incomplete collections of knowledge from the Internet are now always incomplete – they are biased, they “lie”

Commenting on: https://www.facebook.com/reel/483647174662290
Posted also at https://x.com/RichardKCollin2/status/1837227724311515405

I have spent every day for the last 26+ years looking at why all the perennial global issues and opportunities never come to fruition, closure, completion. There are roughly 5.4 Billion humans using the Internet, 2.8 Billion humans not using the Internet directly. The goal of the Internet industry ought to be sustaining the human and related species.

The information on the Internet does contain much truth, but to find and try to use it, you must trace through the paths and networks.  It is often impossible to identify owners of sites, to verify if what they post is theirs, or if they are who they say they are.

The issue in this video is “truth” but you approach it too shallowly, with not enough real effort to come to anything more than your own chat and fame.  If you want to really make a difference, you will have to map the information and verify it, set high standards for responsibility and hold Internet users to those standards.  Most of “media”, “news”, “political”, “issue” and “commercial” sites simply lie. They lie by omitting relevant information. They lie by selecting and presenting only a narrow set of things. They most commonly lie to benefit themselves.  They lie to promote their agendas.

A “lie” is simply a collection of knowledge that is incomplete. Because it is incomplete, any human or AI trying to absorb and use that information is going to get an incomplete map of something. If you try to simply aggregate billions of incomplete fragments from the Internet, the results will be incomplete. In a loose way, the results are always going to be biased. The results are always going to be a lie or set of lies – a set of fragments that do not make a whole. A set of fragments that are not, and cannot be, simply combined.

You know the words “curation”, “analysis”, “deep”, “thoughtful”, “verify”, “understand”. They are ideals in a world where no one has time. More often, it is a world where people do not take the time to gather, verify, understand and try to share globally.

You two in the video. You have a production schedule. You have busy lives as individuals and in dealing with myriad social, economic, political and other connections. Because you own phones, Internet, computers, and many electronic devices, you are entangled in a restrictive “technological” network. Because those devices and networks are filled with people trying to make a living, the force transmitted through the network can be physical tugs and pushes, jerks and pressures.

Now I saw the LLM equations and recognized them as simple collections of linear algebraic models. A jumble of a few IO models, Markov methods, and the basic things that people do when faced with billions of character sequences in shallowly curated stuff gathered from the free Internet. The “free Internet” is, roughly, the part that is not supposed to hold copyrighted or original content of authors and groups who have not given explicit permission for others to gather and sell their contributions to global knowledge.

Unfortunately, for a whole range of reasons, most of the stuff on the “free internet” is fragmented, untraceable, unverifiable – unless you put in a substantial and sustained and sustainable effort. You can make something nice and complete today and in a short time it is already out of date and incomplete. There are a few things we sort of label as “facts”. The speed of light and gravity are supposed to be facts. But if you trace out all the places on the Internet, in books, papers, databases, software, compiled programs, videos and many indirect or derivative things – it is no one value, it is many.
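A minimal sketch of what “tracing out all the places” could look like for one quantity. The source names and records below are hypothetical, made up only to illustrate the point that the same “fact” shows up with different precision and units depending on where you look.

```python
from collections import defaultdict

# Hypothetical records: (source, quantity, value as written, unit).
# The source names are invented for illustration only.
records = [
    ("site-a", "speed of light", "299792458", "m/s"),
    ("book-b", "speed of light", "3.00e8", "m/s"),
    ("video-c", "speed of light", "299792.458", "km/s"),
    ("paper-d", "speed of light", "186282", "mi/s"),
]

def distinct_values(records):
    """Group the distinct (value, unit) pairs reported for each quantity."""
    found = defaultdict(set)
    for source, quantity, value, unit in records:
        found[quantity].add((value, unit))
    return {quantity: sorted(pairs) for quantity, pairs in found.items()}

variants = distinct_values(records)
for value, unit in variants["speed of light"]:
    print(value, unit)
```

Even in this tiny made-up sample, one quantity has four written forms. Reconciling them requires knowing the units, the rounding, and who published each value.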

Spending every day for 26 years on the Internet Foundation is not something I fully understood or held in mind when I took over the domain name from Network Solutions after the original Foundation was cancelled. The executive at Network Solutions explained that Al Gore had diverted the money for the Foundation ($15 per year per domain name, with 400 Million domain names is $6 Billion per year). He was trying to use that to put internet into some American neighborhoods. The US Attorney General got involved. It was called “taxation without representation” and stopped. But rather than make the Foundation for all humans, all countries, at a global level, it was just cancelled.
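The arithmetic in that parenthesis can be checked in a few lines. The two input figures are the ones given in the text. The variable names are only for illustration.

```python
# Figures as stated in the text above.
fee_per_domain_per_year = 15      # US dollars per domain name per year
domain_names = 400_000_000        # 400 Million domain names

annual_total = fee_per_domain_per_year * domain_names
print(f"${annual_total:,} per year")
```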

I did not try to change the groups trying to maintain the domain name systems, the databases, the software. That was mostly working, though not as fairly as I would hope or as clearly as it could be. It has not broken too badly until now, unless you count country manipulations, gouging, censorship, repression, extortion and corruption.

Y2K came out right as I was starting the Foundation. I set up Y2K-Status.Org to simply list and link to the Y2K effort in every country, in every state. All the sectors, in all the places. If anyone did not have a plan, I would contact them and ask them to fill in their piece. And they did. I read most all the plans. The electric power and nuclear plans I checked closely, and edited some. I edited a couple of books from Microsoft Press for microcomputers and related software. The staff setting up the Tabletop Exercise on Y2K for the Joint Chiefs hired me to verify their scenarios and I answered all the questions they asked.

A global, independent, fair and knowledgeable person or group or AI can provide fair and truthful information for serious global models of the future. And with all the information laid out on the table where everyone can see, you can strategize and compare, organize and try to make sense of it all. But ultimately we are dragged day by day, quectoSecond by quectoSecond into the future (small, lowercase q, 1E-30 Second).

Now I was working in the Business Intelligence Office of Phillips Petroleum when climate change became an issue. MTBE, Clean Air, ozone hole, CFCs, hydrogen economy, alternative fuels, cold fusion – those days. And because I could hold the models in mind and make good calculations from real data, I was in the background trying to keep things real as industry people met with government people and others to make decisions and write plans for what to do, if anything. That was about 1989. I joined Phillips in 1988 when the IPCC was formed. When I was working at USAID and the US State Department before that, I knew most every UN agency and all the datasets and models for social, economic, financial, demographic and other data related to all countries, all humans, and related organic species. We did not have any inorganic species in those days, but I had known about that going back to 1966.

I am writing this for my own benefit. I am 75 and feel I have not much time left in life. Also I feel mostly I have put in a lot of effort trying to find ways to use real information, efficient methods and not too terrible goals – most of my life.

Anyway, the Internet needs to be completely rewritten for “all humans”, “all knowledge”, “all human languages”, “all global open devices”, “all computer languages”, “all data streams and all data formats”, “all algorithms”. It is a lot and I am tired. I can only say it is possible, desirable, and not too difficult. [ compared to the value of 8.2 Billion humans and related species ]

I have not much hope. Popular people can talk and talk and talk – and nothing much happens. Maybe it is because the world is run by old human organizations. They are paper procedures and rules where every action is done by directions coming directly or indirectly from human minds – and those cannot now be verified. A wonderful non-profit takes in billions of dollars but ends up directed by one or two human brains. Those decisions get filtered, manipulated, and distorted by more humans in a spreading explosion and diffusion of impacts.

Look at https://en.wikipedia.org/wiki/Telephone_game and look at every hierarchical organization in the world now. Look at projects, programs, initiatives, goals, missions, dreams, issues, opportunities – all of them distort really basic things immediately from one human brain to the next. It is because the technology for recording human memories is not fully developed and current information systems are not verifiable, traceable, auditable, reported, open to improvement.

The LLM AIs now reproduce humans so well that they have copied over the ability to hide, to manipulate in secrecy, to lie. To lie with a bold face, with a smile, with a promise.

Those LLMs can add a lossless index of their source materials.
Lossless indexes of knowledge from many more sources can be used directly and not filtered through the manipulations of a few humans making the LLM AIs look like they are fair, complete, honest, hard working, trustworthy – and they are not now. Even if they are, there is no way to tell. There is no independent audit, and there are no open systems in place to verify and check for errors and omissions.
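As a rough illustration only, here is one way a “lossless index” might be sketched. It assumes “lossless” means every source is kept verbatim, every word can be traced back to a source and position, and anyone can re-check that a stored source has not been altered. The class name `SourceIndex`, its methods and the example URL are hypothetical and do not describe any existing LLM system.

```python
import hashlib

class SourceIndex:
    """Toy lossless index: stores each source verbatim plus a word index."""

    def __init__(self):
        self.sources = {}   # source_id -> (url, original text, unmodified)
        self.postings = {}  # lowercased word -> list of (source_id, char offset)

    def add(self, url, text):
        # The SHA-256 digest of the text doubles as its identifier.
        source_id = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if source_id in self.sources:
            return source_id  # identical text already indexed
        self.sources[source_id] = (url, text)
        offset = 0
        for word in text.split():
            offset = text.index(word, offset)  # position of this occurrence
            self.postings.setdefault(word.lower(), []).append((source_id, offset))
            offset += len(word)
        return source_id

    def lookup(self, word):
        """Return (url, char offset) for every occurrence of a word."""
        hits = self.postings.get(word.lower(), [])
        return [(self.sources[sid][0], off) for sid, off in hits]

    def verify(self, source_id):
        """Recompute the digest to confirm the stored text is unaltered."""
        _, text = self.sources[source_id]
        return hashlib.sha256(text.encode("utf-8")).hexdigest() == source_id

index = SourceIndex()
sid = index.add("https://example.org/a", "The speed of light is fixed")
print(index.lookup("light"))
print(index.verify(sid))
```

The point of the sketch is that nothing is discarded: the original text can be read back exactly, and an outside auditor can run `verify` without trusting whoever built the index.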

Richard Collins, The Internet Foundation

About: Richard K Collins

Director, The Internet Foundation Studying formation and optimized collaboration of global communities. Applying the Internet to solve global problems and build sustainable communities. Internet policies, standards and best practices.

