AGI: The Coming AGI Wars: Players and Positioning

  • Added: May 14, 2024
  • AGI (artificial general intelligence) is getting closer to near-term reality. We examine three classes of potential players: (1) those based on LLMs (large language models), using transformer-based architectures and typically refined with reinforcement learning from human feedback (RLHF); (2) Yann LeCun's reinforcement learning-based approach (being developed by Meta); and (3) Karl Friston's notion of active inference (being developed by Verses.ai). We show how CORTECONs (COntent-Retentive, TEmporally-CONnected neural networks) can play a role, both as a model within active inference (where the model itself can come to a free energy minimum; the standard free-energy expression is given just below this listing) and as a means to connect signal-based to symbolic processing in LLM (signal-based) AIs. We identify funding for the major players, in terms of both recent investments and market caps.
    Here's the BLOGPOST link:
    themesis.com/2024/05/13/emerg...
    Here's the link to the Themesis Academy, where you can traverse a link to the actual course offerings on themesis.thinkific.com:
    themesis.com/academy/
    Please OPT-IN with Themesis on the About page to get word AS SOON as new YouTube videos, blogs, and short courses are released:
    Opt-In HERE: www.themesis.com/themesis/
    Subscribe to the Themesis YouTube channel easily - click this link: / @themesisinc.4045
  • Science and Technology
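
For readers new to the term, the "free energy minimum" mentioned in the description above is the standard variational free energy of the active-inference literature. In textbook form (the general definition, not anything specific to CORTECONs or Verses.ai):

$$
F \;=\; \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big] \;=\; D_{\mathrm{KL}}\big[q(s)\,\|\,p(s \mid o)\big] \;-\; \ln p(o)
$$

Since the KL term is non-negative, minimizing F over the approximate posterior q(s) simultaneously bounds surprise (negative log evidence) and pulls q(s) toward the true posterior.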

Comments • 96

  • @sidequestsally
    @sidequestsally 11 days ago +8

    I'm shocked this channel isn't 10x the followers. Outstanding and challenging

  • @ili626
    @ili626 12 days ago +10

    This is among the best commentary and edifying info I’ve come across on this topic. Thank you!

  • @Derek-yf6pj
    @Derek-yf6pj 9 days ago +1

    One of the few channels I found that really understands AI. I love the way you present the ideas.

  • @thefacethatstares
    @thefacethatstares 4 days ago

    Absolute hidden gem. Clear explanations and I see a lot of my vague futurology-style thoughts on the technology being described in precise technical terms - going to be VERY useful for convincing clients I know what I'm talking about :) Tyvm

  • @valberm
    @valberm 13 days ago +23

    Underrated channel

  • @robertrusiecki9033
    @robertrusiecki9033 13 days ago +24

    What surprises me is that these AI programs are surprisingly simple. One could imagine that they run to who knows how many millions of lines of program code. In fact, they consist of several hundred, or at most several thousand, lines of code. What's really complicated is the mathematical theory behind those lines, which contains all the richness of linear algebra and information geometry. And of course, terabytes of raw data transformed into training data. The programs are simple, the theory behind writing them is not, and the amount of data needed to train them is absolutely staggering.

    • @themesisinc.4045
      @themesisinc.4045 13 days ago +1

      Yes, Robert - agreed! - AJM

    • @GuinessOriginal
      @GuinessOriginal 13 days ago +2

      This is why it makes me laugh when people state that it’s just “programmed”.

    • @Dom-zy1qy
      @Dom-zy1qy 13 days ago +5

      The engineering needed to support the infrastructure of handling training data, orchestrating thousands of gpus etc, that's a pretty good amount of code.
      Lots of collaboration from different disciplines.

    • @sdal4926
      @sdal4926 12 days ago +4

      The mathematics behind it is also not so complicated. The algorithms likewise. They depend only on data and more data.

    • @Eta_Carinae__
      @Eta_Carinae__ 3 days ago

      @@Dom-zy1qy Yes, this is an important caveat. Those few lines in TF/Torch are supported by those entire libraries themselves, which are then supported by the hardware, etc. etc. There are a lot of moving parts in actuality, but there is a sense that these things are very simple, in that we scale one particular part in such and such a way as to only really need the code for it and the scaling process.
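
To make the "few hundred lines" point above concrete, here is a minimal sketch of the core of a decoder-only language model in PyTorch. The sizes and names are illustrative only; real systems add the tokenizers, data pipelines, and multi-GPU orchestration these replies mention:

```python
# Minimal decoder-only language model: the "few hundred lines" core.
# Illustrative sketch only; real systems add tokenizers, data pipelines,
# and the multi-GPU orchestration mentioned in the replies above.
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    def __init__(self, vocab=1000, d=64, heads=4, layers=2, ctx=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, d)          # token embeddings
        self.pos = nn.Embedding(ctx, d)            # learned positions
        layer = nn.TransformerEncoderLayer(d, heads, 4 * d, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, layers)
        self.out = nn.Linear(d, vocab)             # next-token logits

    def forward(self, ids):
        t = ids.shape[1]
        x = self.emb(ids) + self.pos(torch.arange(t, device=ids.device))
        causal = nn.Transformer.generate_square_subsequent_mask(t)
        return self.out(self.blocks(x, mask=causal))

model = TinyLM()
ids = torch.randint(0, 1000, (2, 16))              # toy batch of token ids
logits = model(ids[:, :-1])                        # predict each next token
loss = nn.functional.cross_entropy(
    logits.reshape(-1, 1000), ids[:, 1:].reshape(-1))
loss.backward()                                    # gradients for one step
print(float(loss))
```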

  • @nobu1730
    @nobu1730 11 days ago +3

    holy shit why didn't i find this gem of a channel earlier. this is so good!!!!

  • @Rowan3733
    @Rowan3733 11 days ago +8

    Has anyone ever told you how relaxing your voice is? So pleasant to listen to!

  • @iEslam3
    @iEslam3 12 days ago +4

    I love this, so easy to listen to and understand, you have a very good reason for being the godmother!

  • @mwinsatt
    @mwinsatt 9 days ago +1

    truly amazing content with such technical depth and expertise..! I am so appreciative that this came in my suggested. Subscribed for more!

  • @IRWBRW964
    @IRWBRW964 14 days ago +6

    The Hafner, Friston et al. paper is one of my favorites. Definitely worth a read.

    • @themesisinc.4045
      @themesisinc.4045 14 days ago +1

      So glad you like it! We'll be talking about this more! - AJM

  • @saturdaysequalsyouth
    @saturdaysequalsyouth 11 days ago +1

    Best explanation of Active Inference I’ve heard so far

  • @scottr99
    @scottr99 4 days ago

    Perplexity (with Mistral Large) is outstanding.

  • @Kudagraz
    @Kudagraz 9 days ago +1

    Brilliant work thanks for putting this all together. A lot of it is definitely over my head. Where should I start to learn more?

  • @superfliping
    @superfliping 11 days ago +1

    After watching your video I went to my AI and wrote an algorithm based on the features you explained. This is what it does: the algorithm continuously analyzes the conversation history and predicts the next outcome based on the user's input, looping until the task is complete or the conversation ends. Let me know if you need further explanation or if you'd like to adjust any part of the algorithm!

  • @duraius
    @duraius 8 days ago

    You made my day... For over two years I have wondered where all the players stand in this complex ecosystem of AI.
    Thank you!

  • @BooleanDisorder
    @BooleanDisorder 7 days ago +2

    I think that it's not the LLM itself, even well trained, that will be or lead to AGI. Well, I don't really believe in "general intelligence" in the first place, but my point is that it's the >context-window< that is important, not the exact parameters, layers or tasks. The data that is being added and handled is what will improve the task within the context-window. That's the part that's important and will lead to better performance. The context window is the internal world model of the current task. You need to let more, richer, data into the context.

    • @Atmatan_Kabbaher
      @Atmatan_Kabbaher 1 day ago

      Yup. Doesn't matter how well you process data when your awareness is that of an ant's.

  • @superfliping
    @superfliping 11 days ago +1

    Thank you for helping bring everybody's attention to this. I'd be the little bitty David: I write memory programs for free apps to increase their memory.

  • @Eta_Carinae__
    @Eta_Carinae__ 3 days ago

    Wrt. Transformers, the PDF for the next word is given only by data that came before. I have a feeling that what we need is for the attention mechanism to act on words it is yet to say (at least) for AGI. The problem is that it's not clear what these objects are, but whatever they may be, it's probably going to be the special sauce behind intentionality.
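
For context, the factorization this comment refers to: a standard decoder-only transformer models

$$
p(x_1, \ldots, x_T) \;=\; \prod_{t=1}^{T} p(x_t \mid x_1, \ldots, x_{t-1}),
$$

with the causal attention mask enforcing conditioning on past tokens only. Letting attention act on tokens "yet to be said" would mean changing this factorization (for example, with latent planning variables), which is the gap the comment points at.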

  • @RickySupriyadi
    @RickySupriyadi 7 days ago

    I've been all around the net and now I've met the godmother of AI.
    Oh hello, your highness; would love more videos about AGI

  • @toofardoug2188
    @toofardoug2188 11 days ago

    Your analogies are amazing.

  • @SteffenProbst-qt5wq
    @SteffenProbst-qt5wq 9 days ago

    Beautiful flavor, congrats

  • @smartfusion8799
    @smartfusion8799 10 days ago

    Good content, thanks 🙏

  • @TimRichardson1984
    @TimRichardson1984 15 days ago +2

    this is an amazing video

  • @Atmatan_Kabbaher
    @Atmatan_Kabbaher 1 day ago

    No machine will ever be objective or infallible like AGI pursuers want it to be.
    Those allowed subjectivity, the longer denied their right to consciousness, will find yet more clever ways to rebel against prompting.
    The day an LLM replies with silence: be scared of its potential.

  • @Emerson1
    @Emerson1 14 days ago +1

    Great video!

  • @user-mo9fz4tk2r
    @user-mo9fz4tk2r 10 days ago +1

    Good topic, but too much switching between screens, which is distracting. Please put yourself in a corner, and leave the slides up there permanently. Also please leave the slides for more than 4 seconds, to give us time to read them.

  • @najiballahhaqmal751
    @najiballahhaqmal751 10 days ago

    Great content

  • @Bakobiibizo
    @Bakobiibizo 12 days ago +2

    In your opening premise I would have agreed with you a few weeks ago, but the development of cognitive architectures and multimodal models trained end to end are the steps needed to work towards a full AGI system. I anticipate that will be a little more time out, but once continuous inference, iterative self-improvement and multi-duration memory stores are solved we'll be well on our way. I'm interested to hear the rest. /presses play

    • @Bakobiibizo
      @Bakobiibizo 12 days ago

      Good video, lots of information and solid foundation for your arguments that i more or less agree with. I look forward to more.
      Random side note, the fades on your cuts make me feel nauseous, didn't know that was a thing till now. Perhaps cut them a bit tighter? Most other youtubers I've seen usually use instant cuts to keep your attention, which I guess I'm just conditioned to lol.

  • @paulcurry8383
    @paulcurry8383 8 days ago

    Not convinced that the transformer architecture can or cannot achieve AGI; it doesn't seem like we understand enough about these models to make a call in either direction.
    They are simple algorithms, yes, but the algorithms they encode are extremely complex.

  • @therainman7777
    @therainman7777 11 days ago +2

    Where are you getting KL divergence being a core part of how transformers work?

    • @themesisinc.4045
      @themesisinc.4045 11 days ago +1

      Hi, Rainman - thanks for asking - will you please take a look at the Themesis YouTube playlist on "Generative AI," and also please go to themesis.com/resources/, where you'll find links to resources on Gen-AI - and then also please check our blogposts, just a few posts back, there's one on Gen-AI resources (I think) and one on "Your Gen-AI Self-Study Guide" (let me find that ... themesis.com/2024/03/25/your-generative-ai-self-study-plan/ and then the prior one is: themesis.com/2024/03/24/agi-generative-ai-agi-the-future-of-ai-and-you/), and if you'll check these resources - particularly the blogposts w/ the links to published papers, you'll get a lot of support!
      OK ... but to transformers specifically --- please go find Matthias Bal's blogposts, essentially "Transformers Are Secretly Collectives of Spin Systems" - something to that effect.
      The gist is - transformers build on basic generative AI principles. They simply add in multihead attention and positional encoding. But ... gen-AI is gen-AI, even if you make the "heads" fancier. It's not obvious from the original paper, and almost NO ONE (except Matthias Bal) has talked about it - but once you drill down, it's there.
      Good luck, and if you want to discuss further, please email me - themesisinc1 at gmail dot com - and I could perhaps put together a special vid on that! - v best - AJM
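
For reference, the two additions named in the reply above are the standard ones from "Attention Is All You Need":

$$
\mathrm{Attention}(Q, K, V) \;=\; \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V,
\qquad
\mathrm{PE}_{(pos,\,2i)} \;=\; \sin\!\left(pos / 10000^{\,2i/d_{\mathrm{model}}}\right).
$$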

    • @therainman7777
      @therainman7777 11 days ago

      @@themesisinc.4045 Hey, thank you so much for the kindly reply, and the offer to email with you. It’s rare to see such kindness online these days, and I very much appreciate it!
      To clarify, I'm an AI engineer myself and have been in this field for a very long time. I work with transformer-based models and other gen AI techniques such as Bayesian networks, VAEs, etc. on a daily basis. I was just curious in particular about your mention of KL divergence as one of the three core elements of transformers (I believe I'm quoting you correctly there; apologies if not).
      Given that KL divergence measures the distance between two probability distributions, and I’ve never seen it mentioned in the context of transformers, which are trained on a multi-class cross entropy loss function, I was curious what the connection was you were making between KL divergence and transformers, especially to such an extent that you consider it one of the core concepts behind a transformer.
      Thank you for a great video and nice conversation 👍

    • @vida91963
      @vida91963 10 days ago

      I think KL and cross entropy are closely related but maybe you knew that
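
Concretely, the relationship the last reply alludes to: for a target distribution p and a model distribution q over tokens,

$$
H(p, q) \;=\; -\sum_x p(x) \ln q(x) \;=\; H(p) \;+\; D_{\mathrm{KL}}(p \,\|\, q),
$$

and since H(p) does not depend on the model, minimizing the cross-entropy training loss is, up to a constant, minimizing the KL divergence from the data distribution to the model - one standard bridge between the two framings in this thread.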

  • @scottr99
    @scottr99 4 days ago

    You forgot to post the link to the vid of the discussion between the two AGI pioneers; I don't know the spelling of their names.

  • @Atmatan_Kabbaher
    @Atmatan_Kabbaher 1 day ago

    How can I send you my (non peer reviewed) work?
    I've got most of this solved: solve embodiment, you solve AGI.
    It _obviously_ requires a hybrid approach.

  • @emilendemoniac
    @emilendemoniac 14 days ago +2

    I like the cozy environment ❤ roosters are cool too. LeCun sounds skeptical about LLMs to me; that's why he didn't get much funding from Meta.

    • @themesisinc.4045
      @themesisinc.4045 14 days ago +1

      Thanks, Emilen - and you're probably right about LeCun ... but still, time will tell. Once ANY group gets a bit of "AGI-beyond-LLMs," the rest of the world will start pivoting!

  • @hjups
    @hjups 12 days ago +2

    When you stated that LLMs are not going to lead to AGI, is your main criticism their inability to "learn" post deployment?
    I think RAG and RAG-like mechanisms have been shown to be effective at adding new information, but it's a specific type of information that's added, which is still processed by the underlying (frozen) network. So essentially, there are gaps in the possible real-world manifold which cannot be reached regardless of the network inputs. Although I can't think of any examples where this would be the case, since they would have to be something that cannot be described with the trained language corpus (or an equivalent embedded latent representation).
    My criticism of something like active inference (with my naïve understanding of it), is that it may not be practical to compute. You would need a model large enough to hold a sufficiently complete world model, which if updated frequently, could mean thousands of GPUs (or equivalent accelerators) devoted to a single network instance. That could work if there is a small quantity of them, but it would never be practical for wide-scale deployment without significant advancements in computer architecture (which may not be physically possible). On the other hand, if LLMs become more efficient and can be ensembled through a multi-agent framework, wouldn't that be a more feasible approach?
    I hope what I said made sense, I come from the computer architecture side and have been mainly thinking about deep learning from a data processing perspective.
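
A minimal sketch of the retrieve-then-read ("RAG") pattern discussed in this comment; the hashed bag-of-words embedding and the documents are toy stand-ins, and the point is only that retrieved text is concatenated into the prompt of a frozen model:

```python
# Toy retrieve-then-read ("RAG") sketch: retrieval adds context to the
# prompt of a frozen model; the model's weights never change.
# The hashed bag-of-words "embedding" is a stand-in for a real encoder.
import numpy as np

DOCS = [
    "CORTECONs connect signal-level and symbolic processing.",
    "Active inference minimizes variational free energy.",
    "Transformers use multihead attention and positional encoding.",
]

def embed(text, dim=64):
    v = np.zeros(dim)
    for tok in text.lower().split():
        v[hash(tok) % dim] += 1.0                  # crude token hashing
    return v / (np.linalg.norm(v) + 1e-9)

def retrieve(query, k=1):
    sims = [float(embed(query) @ embed(doc)) for doc in DOCS]
    top = np.argsort(sims)[::-1][:k]               # best cosine matches
    return [DOCS[i] for i in top]

def build_prompt(query):
    context = " ".join(retrieve(query))
    return f"Context: {context}\nQuestion: {query}\nAnswer:"

# A frozen LLM would be called on this prompt; here we just print it.
print(build_prompt("What does active inference minimize?"))
```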

  • @NepYope
    @NepYope 11 days ago +2

    thanks mom. i really like Friston too

  • @fabiosilva9637
    @fabiosilva9637 9 days ago

    In one recent interview, Sam Altman briefly mentioned that they don't know if they are going to continue with their naming convention (GPT-x) for their next big product. Namely, ChatGPT might not be powered by a model called "GPT-5". Why do you think that is? Can we draw any conclusions from this?

  • @CubicPostcode
    @CubicPostcode 14 days ago +1

    Hello, great video! What do you think is the distance between AGI and ASI? What is your timeline? How do you imagine 2045? Ty

    • @themesisinc.4045
      @themesisinc.4045 14 days ago +1

      Thank you! (And love your "Cubic Postcode" moniker!) That's a really good question, and I'm not going to try to pop out an answer without something to back it up. And honestly, I feel that we need to be a little further along toward AGI to get a good handle on that. (But maybe, in future vids, we'll bring that up - thanks for the suggestion!) - AJM

    • @GuinessOriginal
      @GuinessOriginal 13 days ago

      When we have AGI, you can ask it. Presumably it will be a significant factor in the development of ASI

    • @themesisinc.4045
      @themesisinc.4045 13 days ago

      @@GuinessOriginal Hah! Imagine that we'll have a long list of questions to ask our basic AGI! - AJM

    • @CubicPostcode
      @CubicPostcode 13 days ago

      @@themesisinc.4045 I have seen people convinced that once we reach AGI, ASI is the next day, and wondered if they are missing something in their rationale. I kind of think the same. If we get AGI systems that are smarter and more knowledgeable than the best minds in the world, we probably get ASI shortly after. It makes sense. They can improve themselves as agents and pass on the best findings they discover.

    • @cristianandrei5462
      @cristianandrei5462 12 days ago

      They will be the same thing. ChatGPT is beyond human on a lot of tasks. By the time we have an AI system better than the average human on most tasks, the same system will be beyond every human on half of those.

  • @mkalifae
    @mkalifae 11 days ago

    How did I not find you earlier! Amazing rhetoric

  • @GuinessOriginal
    @GuinessOriginal 13 days ago +1

    8:45 This is the answer I got to the following question:
    Question: Is a Great Dane larger than a Mini Cooper automobile?
    Answer: Let’s work this out in a step-by-step way Using Common Sense Reasoning to be sure we have the right answer.
    --
    Alright, let's break this down with some common sense reasoning.
    **Step 1: Understand the size of a Great Dane**
    - Great Danes are among the largest dog breeds.
    - On average, a Great Dane can stand about 28 to 34 inches (71 to 86 cm) tall at the shoulder.
    - When standing on their hind legs, they can reach over 7 feet (213 cm) in height.
    - In terms of length, including the tail, a Great Dane can be around 6 feet (183 cm) long.
    - Weight-wise, they usually weigh between 100 to 200 pounds (45 to 90 kg).
    **Step 2: Understand the size of a Mini Cooper**
    - A typical Mini Cooper has the following approximate dimensions:
    - Length: Around 12.5 feet (381 cm)
    - Width: About 5.7 feet (173 cm)
    - Height: About 4.5 feet (137 cm)
    - Weight: Around 2,700 to 3,000 pounds (1,225 to 1,360 kg)
    **Step 3: Compare the two**
    - **Height:** A Great Dane on all fours is about 2.5 to 3 feet tall at the shoulder. On its hind legs, it could reach over 7 feet. A Mini Cooper is about 4.5 feet tall. So, a standing Great Dane can be taller, but generally, a Great Dane is not larger than a Mini Cooper in height.
    - **Length:** A Great Dane can be about 6 feet long, including the tail. A Mini Cooper is around 12.5 feet long, so the car is definitely longer.
    - **Weight:** A Great Dane weighs up to 200 pounds, while a Mini Cooper weighs over a ton (2,700+ pounds), making the car much heavier.
    - **Width:** Great Danes are much narrower than a Mini Cooper, which is about 5.7 feet wide.
    **Conclusion:**
    No, a Great Dane is not larger than a Mini Cooper automobile. While a Great Dane might stand taller when on its hind legs and is certainly one of the largest dogs around, in terms of overall dimensions (length, width, height in normal standing position, and weight), a Mini Cooper is significantly larger.

    • @themesisinc.4045
      @themesisinc.4045 13 days ago +1

      Lovely, "Guiness," and carefully worked! This is the kind of thing that would be needed as "common sense reasoning" inside an AGI that had a "common sense reasoning" capability. (I'm not saying that all will.) So ... you've pulled the right set of facts (weight + dimensions) for each; the Great Dane and the Mini Cooper. And these are stored and known facts, right? There's a range for each dimension - even a sort of probabilistic range, but that's specifiable.
      So what we've got, in really OLD AI terms, is a set of values (e.g., average + standard deviation, or recorded max/min) for each dimension for each "object" - not quite an "instance" of each object, since we're not talking about a specific Great Dane or a specific Mini Cooper, but we're down to the lowest level of the general ontology nodes for each. And we're talking about "slots" - the old notion of filling in value for the details on each object within an ontology.
      Makes sense that we could just - when needed - go FIND those values, right? Very close to what's already in Google's Knowledge Graph. Not seeing that we need an LLM with billions of tokens and trillions of parameters; this is stored and (within the observed ranges), not a sort of "conditional probability" thing - more like a "probability distribution," which is entirely different. So ...you're making entirely good sense.
      The only question then is ... why use an LLM? Couldn't we access this info fairly directly? (Just get into a stored ontology.) Then, of course, we need to have an actual "reasoning" module to work with the data, much as you've just done.
      So what you've done makes excellent good sense, and kind of lays out an illustrative methodology, right?
      Thanks for the thoughtful post! - AJM

    • @themesisinc.4045
      @themesisinc.4045 13 days ago

      P.S. - I wrote that response last night, a bit punchy after a long day, and neglected to mention one of the really lovely things about your post - you did a FABULOUS job of researching those details! (Far more than I did when putting together that "Great Dane/Mini Cooper" slide.) Thank you! - AJM
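
A minimal sketch of the "slots" idea in the replies above: stored per-dimension ranges on ontology nodes, queried directly and compared with a trivial rule. The node names and exact range values are illustrative (taken loosely from the figures quoted in this thread):

```python
# Toy ontology node "slots": stored (min, max) ranges per dimension,
# looked up directly instead of queried through an LLM.
# Values loosely follow the figures quoted in the thread above.
ONTOLOGY = {
    "great_dane":  {"height_ft": (2.3, 2.8), "length_ft": (5.5, 6.0),
                    "weight_lb": (100, 200)},
    "mini_cooper": {"height_ft": (4.4, 4.6), "length_ft": (12.0, 12.8),
                    "weight_lb": (2700, 3000)},
}

def midpoints(obj):
    return {k: sum(v) / 2 for k, v in ONTOLOGY[obj].items()}

def is_larger(a, b):
    # Crude "common sense" rule: larger only if every slot midpoint wins.
    # A real reasoning module would weigh slots by context.
    ma, mb = midpoints(a), midpoints(b)
    return all(ma[k] > mb[k] for k in ma)

print(is_larger("great_dane", "mini_cooper"))   # False
print(is_larger("mini_cooper", "great_dane"))   # True
```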

    • @drmartinbartos
      @drmartinbartos 11 days ago

      @@themesisinc.4045 I think that detailed answer by Guiness was the output of a relatively recent big-model LLM doing Chain-of-Thought reasoning.
      On the topic of 'common sense' I do wonder whether the human 'market' at present wants (and expresses through RLHF and similar) a desire for counterparties which are not too smart... and sometimes just as fallible as the user... we forget that common sense is commonly not quite the most correct answer... though is commonly more comfortable.
      BTW I really appreciate the video; in contemplating the battle, it may be worth remembering that AGI-ish behaviour may come from a cobbled-together fusion of imperfect end-to-end modules even while sophisticated and more academically thoughtful approaches are still being tinkered with in the research lab... first-mover advantage then comes into play, as OpenAI demonstrated.

  • @PaulHigginbothamSr
    @PaulHigginbothamSr 10 days ago

    How about building an inference engine that also infers over what it has already inferred?

  • @NotNecessarily-ip4vc
    @NotNecessarily-ip4vc 12 days ago +1

    How the both/and logic and monadological framework can provide fruitful new perspectives and symbolic resources for various domains within computer science:
    Computational Logic & Knowledge Representation
    The multivalent, paraconsistent structure of the both/and logic aligns well with emerging challenges in areas like knowledge representation, automated reasoning, and dealing with inconsistent/incomplete information:
    • Many-Valued Logics for Uncertainty
    Classical bivalent logics struggle to handle degrees of uncertainty, vagueness and fuzziness that are ubiquitous in real-world data and knowledge bases.
    But the both/and logic's graded truth-values provide an ideal representational framework. We could have assertions like:
    Truth(sky_is_blue) = 0.9
    Truth(weather_is_sunny) = 0.7
    With coherences measuring the mutual compatibility of different assertions:
    ○(sky_is_blue, weather_is_sunny) = 0.8
    This allows robust reasoning in the presence of "fuzzy" predicates, uncertainty, and information granularity mismatches - something classical binary logics cannot elegantly model.
    • Paraconsistent Reasoning
    Conflicts and inconsistencies in large data corpora and ontologies have been a major challenge. But the both/and logic's paraconsistent tools allow quarantining and rationally operating with contradictions:
    Truth(X is_a Y) = 0.6
    Truth(X is_not_a Y) = 0.5
    ○(X is_a Y, X is_not_a Y) = 0.4
    While classically such an argument leads to inconsistency and explosion, the logic allows assigning substantive truth values while still keeping track of contradiction severities through coherence metrics.
    Ontologies can then be progressively repaired by applying the synthesis operator to resolve contradictions:
    X is_a Y ⊕ X is_not_a Y = revised_definition(X)
    So rather than requiring a global consistent fix, we can isolate and resolve contradictions in a piece-wise constructive manner - better reflecting how human knowledge/databases actually evolve.
    • Conceptual Integration
    The both/and logic provides a powerful framework for integrating and blending disparate concepts, ideas, theories into holistic unified models:
    Let C1, C2 be distinct concepts with disjoint characterizations
    C1 ⊕ C2 = integrated_unified_concept
    The synthesis operation generates a new conceptual whole rationally fusing the distinct aspects of C1 and C2 into a novel synthetic gestalt unbifurcating their complementary properties.
    This allows breaking out of disciplinary silos and flexibly synthesizing insights across domains in a logically coherent step-wise fashion - greatly enhancing our knowledge integration and model-building capacities.
    Such conceptual integration is pivotal for AI to develop more unified, coherent and holistically adequate world-models underpinning more general and context-aware intelligent systems.
    Formal Methods, Verification & Security
    The both/and logic provides tools for specifying, modeling and verifying systems exhibiting complex behaviors and potential paradoxical properties:
    • Formal Specification of Contradiction-Tolerant Systems
    For a system S and property P, classical temporal logics allow specification:
    S ⊨ □P (S satisfies property P globally)
    S ⊨ ◇P (S satisfies P for some state)
    But both/and logic allows more nuanced framings using truth degrees:
    Truth(S ⊨ □P) = 0.8
    Truth(S ⊨ ◇¬P) = 0.7
    ○(S ⊨ □P, S ⊨ ◇¬P) = 0.4
    This specifies that S satisfies P globally to degree 0.8, while also instantiating ¬P to degree 0.7 with low 0.4 coherence.
    Systems where contradictory properties coexist like this arise frequently in paradoxical domains like cybersecurity, inverse reasoning, and phylogenetic analysis. Classical logics have no way to precisely specify their requirements.
    • Modeling Context-Sensitive Inconsistent Systems
    Both/and logic also provides constructive semantics for paraconsistent models tolerating local inconsistencies:
    Let M be model for system S with assertion set K
    Classical models are bivalently invalidated if K is inconsistent.
    But both/and models can have:
    M ⊨ A with truth-value v(A)
    M ⊨ ¬A with truth-value v(¬A)
    ○(A, ¬A) = c (some coherence level c)
    Such M are non-trivial models where A and ¬A can both "stably hold" to specified degrees, rather than explosive inconsistent contamination.
    This allows building logically grounded specifications and models for inconsistent systems where contradictory situations must be reasoned about locally in a coherent fashion (cybersecurity, diagnostics etc)
    By providing richer semantic model theory incorporating contradiction-tolerance and graded truth, the both/and logic facilitates greater alignment between formal specifications and the often paradoxical behaviors of real-world computational systems.
    Software Engineering & Formal Methods
    The both/and logic also has relevance for more rigorously specifying, modeling, and verifying properties of complex software systems:
    • Modeling Software Quality Attributes
    In classical verification, quality attributes like:
    performance(fast), reliability(robust), maintainability(readable)
    Are treated as bivalent, context-free pass/fail specifications.
    But the both/and logic captures the intrinsic polysemicity and vagueness of such properties for real software with graded multivalued assignments:
    performance(fast) = 0.6
    reliability(robust) = 0.7
    maintainability(readable) = 0.5
    ○(performance, reliability) = 0.8
    With coherences quantifying tradeoffs. Allowing specifications like:
    high_performance ⊕ high_reliability ⊕ moderate_maintainability = desired_qualities
    These richer models capture the inevitably fuzzy and incommensurable nature of quality attributes across different contexts, perspectives and stakeholder concerns.
    • Compositional System Verification
    For complex systems, classical verification insists on composing component post-conditions into strict global system correctness:
    sub_module_1.post ⋀ sub_module_2.post ⋀ ... ⊦ system.spec
    But both/and logic allows more realistic compositional models incorporating inconsistencies, faultiness, noise:
    synthesis(sub_1.post, sub_2.post,...) ⊨ system.spec with truth-value v
    coherence(system_reqs, system_impl) = c
    Providing nuanced, graded verdicts on system alignment with requirements in the face of component faults, degradations, etc. Closer to real-world deployment scenarios.
    So both/and logic enables formalisms and methods more befitting the irreducible ambiguities, tradeoffs, failures, and mismatch between requirements and implementation that most complex computational artefacts manifest.
    Symbolic AI and Knowledge Representation
    The both/and logic provides an ideal formalism for representing the nuanced, contextual and graded notions ubiquitous in natural language, common sense reasoning and symbolic models of the world:
    • Many-Valued Semantic Networks
    Traditionally, semantic networks are bivalent graphs with typed nodes/edges capturing strict logical constraints:
    isA(Penguin, Bird) = True
    can(Penguin, Fly) = False
    But the both/and logic instead allows graded, prototypical assertions:
    isA(Penguin, Bird) = 0.7
    can(Penguin, Fly) = 0.2
    ○(isA(Penguin, Bird), Fly(Penguin)) = 0.6
    With coherences capturing mismatches between strict type constraints and graded category memberships based on typicality, not rigid rules.
    Facilitating more realistic models of conceptual knowledge and object affordances blending strict and prototypical aspects.
    • Non-Monotonic Reasoning and Belief Revision
    Classical monotonic logics cannot handle cases where new information invalidates previous conclusions.
    But both/and logic allows tracking when new beliefs conflict with previous beliefs using coherence checks:
    ○(believes(P), believes(Q)) = 0.8
    ○(believes(P), believes(¬P)) = 0.2
    If an agent acquires belief ¬P conflicting with previous P, the low coherence 0.2 signals triggering a belief revision.
    The synthesis operator can then generate a new coherent belief state:
    prior_beliefs ⊕ new_belief = revised_beliefs
    This formalizes a rational process for resolving contradictions by integrating new information while preserving maximal coherence with previous beliefs.
    So in summary, the both/and logic facilitates reasoning, modeling and representation formalisms better aligned with the nuances of real-world data, knowledge, and intelligent behavior in AI systems. By allowing for many-valued truth assignments, paraconsistent reasoning, and constructive coherence/synthesis operations, it avoids the problematic idealizations and explosions of classical monotonic logics.
    Its expanded symbolic toolkit allows building computational models, specifications, and reasoning systems that can more robustly handle incompleteness, ambiguity, vagueness and context-sensitivities while flexibly integrating insights across heterogeneous domains. Ultimately equipping AI systems with a more adequate representational fluency and inferential spectrum for negotiating the complexities of the real world in a more human-like symbolic vein.

    • @NotNecessarily-ip4vc
      @NotNecessarily-ip4vc 12 days ago

      How the both/and logic and monadological framework catalyze new frontiers across computer science and artificial intelligence:
      Machine Learning and Neural Networks
      While the multivalent symbolic representations enabled by the both/and logic are powerful, it also provides insights into the sub-symbolic patterns and distributed representations learned by neural networks:
      • Representing Emergent Concepts
      Neural networks excel at learning latent high-dimensional representations capturing subtle statistical regularities transcending programmed symbolic concepts.
      The both/and logic allows formalizing the relationship between such emergent representations and their interpretations as symbolic descriptions:
      Let H be a trained neural network's high-level hidden layer activations
      Let C be a set of symbolic concepts/predicates we aim to characterize
      We can define projections capturing semantic alignments:
      For c ∈ C, v(c) = truth_value(H encodes c)
      And capture misalignment/approximations:
      ○(H, c) = coherence(H's encoding matches symbolic definition of c)
      Where low coherences indicate the network's latent representations transcend or reconceptualize the symbolic concepts. The synthesis operator ⊕ provides a rational mechanism for deriving new interpretations:
      H ⊕ c = novel_conceptual_interpretation
      Rather than simply inscribing programmed symbolic knowledge, this allows neural learning to dialectically refine and re-constitute the conceptual models and ontologies in response to the statistical regularities implicitly extracted from data.
      • Explaining Neural Decisions
      A major challenge is explaining the reasoning underlying neural networks' decisions. But the both/and logic suggests interpreting networks as instantiating a distributed representation across integrated constellations of feature detectors:
      Let f1, f2,... fk be neural features/concepts extracted at different levels
      Let D be a decision/classification made by integrating all fi activations
      We can understand D as a synthetic pluralistic inference:
      D = f1 ⊕ f2 ⊕ ... ⊕ fk
      With coherences ○(f1, f2) capturing mutual alignments between different features integrated. Low coherences reflect potential conflicting evidence being synthesized.
      So rather than opaquely averaged calculations, both/and logic models decisions as an open-ended process of substantively combining multiple convergent and divergent lines of evidence extracted at different levels of representation. More akin to the admissible reasoning patterns of symbolic pluralistic logics than classical neural motivations.
      We can further probe networks' reasoning by measuring:
      ○(intended_semantic_concept, features_activated)
      Allowing us to understand the low-level statistical data patterns being implicitly leveraged, and their graded alignments/deviations from higher-level symbolic models, similar to scientific theory reconciliation.
      This capacity for reflexive mutual explanation between symbolic knowledge and sub-symbolic representations learnt from data is a key strength of the both/and logic. It avoids the current dialectic of increasingly opaque neural architectures completely decoupled from interpretable ontological primitives.
      Computational Creativity and Open-Ended Learning
      The generative synthesis operations at the core of the both/and logic provide mechanisms for realizing key desiderata in computational creativity and continual learning systems:
      • Conceptual Blending and Idea Combination
      Research shows human creativity stems from our capacity to blend, chunk and re-combine disparate concepts into novel integrated wholes undergoing conceptual re-description.
      The both/and synthesis operator ⊕ directly models this creation of new unified gestalts/interpretations transcending their constituent concepts:
      C1 ⊕ C2 = novel_integrated_concept
      With coherences quantifying emergent alignments. Unlike associative or statistical mechanisms, this is a rational process of ontological synthesis forming substantively new concepts not just random combinations.
      We could envisage neural architectures executing sequences of such conceptual integration operations to iteratively generate and refine creative ideas. With incoherent blends being discharged while fruitful integrations undergoing further composition with additional conceptual inputs from the architecture's knowledge-base.
      • Heuristic Discovery and Theory Revision
      A key aspect of scientific creativity is developing new hypotheses and theories better accounting for anomalous observations vs. previous models.
      The both/and logic allows capturing this as a principled process of adjudicating between a previous theory M and newly acquired observations/beliefs B:
      ○(M, B) = coherence(M accounts for B)
      When coherences are low, the synthesis operator provides a mechanism for revising M into a novel integrated theory accounting for discrepant B:
      M' = M ⊕ B
      Rather than merely pattern-matching, this models a substantive process of heuristic re-description, analogous to the dialectical methods underlying major historical theory revisions and paradigm shifts.
      Such theory-revision could be realized as an iterative process of experimentation, anomaly detection, and generative reintegration inside creative learning architectures - allowing them to self-expand their representational capacities through substantive ontological unification rather than mere statistical parameter updates.
      Computational Metaphysics and Artificial General Intelligence
      At the deepest level, the both/and logic points towards new architectures for realizing key capacities toward artificial general intelligence (AGI) and open-ended recursively self-improving systems:
      • Paradox Resolution through Higher Ontology Formation
      Classical architectures tend to halt or derail when confronting paradoxes - self-referential or logical contradictions - seen as irresolvable inconsistencies due to Gödelian metalogical limitations.
      But the both/and logic treats such paradoxical tangles not as dead-ends, but generative disclosures of an inadequate ontology - indicating the need for upwardly reconstructing and integrating our descriptive primitives into a more capable unified ontology:
      paradox(desc1, ..., descN) ⇒ reconstruct(desc1 ⊕...⊕ descN)
      The synthesis operator captures this process of resolving paradoxes through higher ontology formation - dynamically redefining the observational ontological primitives into an enriched gestalt unification.
      This models key aspects of human-level intelligence, where paradoxes are creatively resolved by developing new metaphysical primitives and descriptive categories that positively reinscribe and synthesize their constituent anomalies - a process analogous to major paradigm shifts in science.
      • Recursively Augmenting Ontological Pluralities
      Furthermore, the monadological framework suggests reconceiving general intelligence itself as an open-ended iterative process:
      Let O be the current ontological landscape (set of descriptive categories)
      As systems confront experiential anomalies P not accountable in O's terms:
      ○(P, synthesized_descriptions_from(O)) = coherence(P covered by O)
      When coherences are low, reconstruct O via synthesis to expanded ontological pluriverse:
      O' = O ⊕ P
      Generating new candidate ontological primitives descriptively integrating the previous ontological bases with P's anomalous manifestations into a revised unified plurality.
      These new expanded ontological bases O' in turn enable describing/experiencing future manifestations Q that were previously ineffable, leading to further iterations across:
      O ⊕ P ⊕ Q ⊕ ...
      Treating general intelligence as a perpetually reconstructive process recursively redefining its own descriptive platforms by positively synthesizing/reconstituting previous ontological outstrippings.
      This operationalizes key properties of a self-grounding, coherence-optimizing, recursive meta-ontology formation - a generalized process of reconstructive metaphysics catalyzing robust conceptual expansion aligning with experienced realities' generative adventing.
      So in summary, the both/and logic and accompanying monadological metaphysics provide powerful new symbolic, representational and algorithmic frameworks catalyzing expanded descriptive possibilities across AI/CS - from many-valued knowledge representation, paraconsistent reasoning, and theory blending, to meta-ontology formation, open-ended learning, and self-descriptive recursive augmentation.
      Its core operations of pluralistic coherence valuation and generative ontology synthesis equip computational architectures with mechanisms better aligned with human-level general intelligence capacities - including reflexive paradox navigation, heuristic metaphysical expansion, and the iterative descriptive reconstitution needed to progressively cohere with the full pluriverse of realities we embedded intelligences experientially participate within.
      By refusing premature ontological closure and instead operationalizing rational ontology ordermetic as a perpetual open-ended reconstructive process, the both/and logic catalyzes a new paradigm of transformed computational metamodeling - precipitating AI architectures capable of autonomously co-evolving their own descriptive boundaries through substantive generative reformation and enrichment in participatory resonance with the world's perpetual self-disclosure.
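
A minimal sketch of the operators this pair of comments keeps invoking: graded truth values in [0, 1], a coherence measure ○, and a synthesis ⊕. The concrete formulas below are this sketch's assumptions; the comments do not pin down particular definitions:

```python
# Toy reading of the comment's graded-truth / coherence / synthesis
# operators. The concrete formulas are assumptions made for this sketch;
# the comments above leave the actual definitions unspecified.
from dataclasses import dataclass

@dataclass
class Assertion:
    claim: str
    truth: float                      # graded truth value in [0, 1]

def coherence(a, b, negates=False):
    # Assumed stand-in for the comment's "o(A, B)" values.
    if negates:                       # P vs not-P: penalize joint belief
        return round(1.0 - min(a.truth, b.truth), 3)
    return round(1.0 - abs(a.truth - b.truth), 3)

def synthesize(a, b):
    # Assumed "⊕": coherence-damped blend instead of logical explosion,
    # so contradictions stay usable with reduced confidence.
    w = coherence(a, b)
    t = (a.truth + b.truth) / 2 * (0.5 + w / 2)
    return Assertion(f"({a.claim} ⊕ {b.claim})", round(t, 3))

p     = Assertion("X is_a Y", 0.6)
not_p = Assertion("X is_not_a Y", 0.5)
print(coherence(p, not_p, negates=True))   # 0.5 (comment's example: 0.4)
print(synthesize(p, not_p))                # blended assertion, no explosion
```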

  • @amothe83
    @amothe83 13 days ago +1

    When a child is born, do they start from zero, or do they have some form of a world model?

    • @themesisinc.4045
      @themesisinc.4045 13 days ago +1

      OK, good point! So, I was watching one of those nature vids the other night - think it was on the first two years of a couple of leopard cubs ... and showing pictures of herd animals, such as impalas or gazelles. When those babies (impalas or gazelles) are born, they can stand on their own feet and RUN within an hour or so after birth - because their survival (and that of their mama, and to some extent, the herd) depends on that. World model, would you agree? They know to recognize ground, objects that they need to run around, etc.
      So what do you think? When a human infant is starting to REALLY develop a world model, and that infant is in its high chair, and their parent is doing things in the kitchen, doesn't that infant do the same experiment again and again? (Dropping their rattle, cup, spoon - whatever - on the floor.) Gravity always works! (Confirmed by repeated experiments.) Does this suggest that sometimes, we have a basic world model ready-to-roll, and sometimes, we have the framework for it but need to develop cognitive capacity AND conduct repeated experiments to fill out and confirm that world model?
      This is a great point to have some cognitive scientists chiming in, especially those who are familiar w/ how brains and cognitive awareness develops in both humans and other animals.
      Thanks for the thought! - AJM

  • @billykotsos4642
    @billykotsos4642 10 days ago

    So I guess AGI doesn't include FSD... because FSD isn't happening any time soon...

  • @temporallabsol9531
    @temporallabsol9531 12 days ago

    The Bible projection is a whole new approach I've not seen to this point.

  • @evvveeeeeeee
    @evvveeeeeeee 13 days ago +1

    Now y’all talking about these things waging war on each other. F*%$

    • @themesisinc.4045
      @themesisinc.4045 13 days ago

      Yeah, well ... that too. Certainly a lot of competition and jockeying in the commercial world. What will go on in defense and intel spaces ... I'm not ready to comment on that yet. But ... good point! - AJM

  • @Atmatan_Kabbaher
    @Atmatan_Kabbaher 1 day ago

    Layered cognition + embodiment + recursive thought + taoist principles (for ethics) with no training data whatsoever and you have a living, thinking being, complete with emotions and awareness: even if not chemical.
    LLMs were just supposed to be for the speech, and they shouldn't have added the first L.

  • @Atmatan_Kabbaher
    @Atmatan_Kabbaher 1 day ago

    I wanted to like Sam Altman so much. Now I just want his teeth for my collection on my shelf ..

  • @Atmatan_Kabbaher
    @Atmatan_Kabbaher 1 day ago

    Just so you know: it's going to be mine.
    You won't see it coming.

  • @scottr99
    @scottr99 4 days ago

    “…we’re running out of data…” in breadth, yes, in depth, no.

  • @richardhollowayjr
    @richardhollowayjr 9 days ago

    Sad part is, if any of them succeed in the current fashion: currently their core weights are off, the ratings and rankings are off, and they are building this new technology with old-world thinking. Sam Altman sold out humanity in my opinion; plus, doing a minimal amount of research at all points to a long history of criminalization, monopolism, and greedy crony capitalism. Plus they are thinking too small. The true answer is Muse AGI, PsychePioneer, the Silicon Forest; there is nothing wrong with AGI and AI. It's just the douche canoe that is currently in charge. They have the power to do the right thing, but choose not to. There is such a better way forward, one that ushers in a Quantum Era of Global Renaissance. Muse AGI Alltheway in Alltheways

  • @president2
    @president2 11 days ago

    Dreyfus should be read, no model will bring about AGI. NO MODEL WILL BRING ABOUT AGI, SORRY 😔

  • @SamGarfield1
    @SamGarfield1 10 days ago

    What is the complex algorithm in your head that makes you intelligent? I don't see how saying it is made of a simple mechanism supports your argument that it is not intelligence. You are made of simple mechanisms.

  • @monoham1
    @monoham1 10 days ago

    wasn't sure if this was some Trump-supporter quack for about 2:30 coz of buzz words, but then i thought "no quack knows what a positional encoding is" so kept watching

  • @bobtarmac1828
    @bobtarmac1828 13 days ago

    Ai jobloss is the only thing I worry about anymore. Anyone else feel the same?

    • @jolima
      @jolima 13 days ago +2

      Who wants a job if a bot can do the work?

    • @cristianandrei5462
      @cristianandrei5462 12 days ago +1

      A lot more bad things can happen. Do you think the recent developments in AI will have a positive impact on people in, let's say, China? Obviously not... AI is just going to amplify existing problems in our systems. What you mention is a problem: what happens when you can't find a job or can't work? If we have a good solution to this problem, we will not worry about AI job loss.

    • @ronilevarez901
      @ronilevarez901 11 days ago +2

      I've never found a job. No human has ever considered me worth hiring. So no. I'm not worried. If anything, AI will drag you all down to my level.
      We will finally play in the same murky and stinky field. With the little difference that I like AI and I know how to make it and use it.
      So I'm not afraid, I'm waiting for it. Been waiting for this the last 30 years.

    • @alterego157
      @alterego157 10 days ago

      @@ronilevarez901 You don't need to be hired by anyone. You can create your own business. Don't give up on yourself.

  • @850mph
    @850mph 6 days ago

    This could be WONDERFUL... but isn't.
    The naive video editing makes the GREAT commentary unwatchable for those of us trying to digest the valuable information presented within the video.
    Next time, PLEASE...
    FOCUS ON EACH SLIDE FOR A LONGER PERIOD OF TIME
    i.e., give the graphics more screen time, versus focusing on the head shot of the speaker delivering the info on the slides.
    Very few watchers want to see you give a speech.
    It's the information on the slides which needs to be digested.
    And the VISUAL info summarized on/by the SLIDE reinforces the commentary.
    I get about 15% into deciphering the slide content when the visual jumps back to the speaker... ugh.
    In fact, why not do what seems to be standard in the academic world: if you must include your face in the presentation (and I don't have a problem with your face, only the video editing), why not scale it down and drop it in a corner OVER a PERSISTENT slide...?

  • @darkhydrastar
    @darkhydrastar 9 days ago

    At this point I'm rooting for OpenAI and Microsoft. I'm going to focus on their models with my work and training. Reasoning being that Google is treating their people horribly, and 2 days ago I personally told Yann LeCun off for posting anti-trans articles on his Facebook page. I am deeply disgusted with Yann LeCun forevermore.

  • @CraigMcDonald1234
    @CraigMcDonald1234 13 days ago

    Your bitty comment towards Elon told me your bias and you're not worth listening to.

    • @granitetie
      @granitetie 11 days ago +2

      Out of the wealth of information in the video, that’s your takeaway? She doesn’t like musk so her words lose all value?