> A modern sparse Transformer, for instance, is not "conscious," but it is an excellent engineering approximation of two core brain functions: the Global Workspace (via self-attention) and Dynamic Sparsity (via MoE).

Could you suggest some literature supporting this claim? I went through your blog post but couldn't find any.