diff --git a/presentation.txt b/presentation.txt index ccc1373..bbd05fc 100644 --- a/presentation.txt +++ b/presentation.txt @@ -1,623 +1,973 @@ - page 1/102 Today I'll talk about my work as a Ph.D. student at the University of Torino. - I will deal with aspects related to phenomenology from the point of view of strings for which I worked both on theoretical and computational levels. + + I will deal with aspects related to phenomenology from the point of view of + strings for which I worked both on theoretical and computational levels. - page 2/102 - The plan is to first introduce a semi-phenomenological description of physics from a theory of strings. - These tools will be used throughout the talk as basis for the rest of the topics. + The plan is to first introduce a semi-phenomenological description of physics + from a theory of strings. + + These tools will be used throughout the talk as a basis for the rest of the + topics. - I will then mainly deal with open strings and analyse correlators in the presence of twist and spin fields seen as singular points in the string propagation. + I will then mainly deal with open strings and analyse correlators in the + presence of twist and spin fields seen as singular points in the string + propagation. - From particle physics, I will then move to a string theory description of cosmology in the presence of time dependent singularities in time dependent orbifods. - I will try to give a reason for the presence of divergences in amplitudes due to particular interactions of string modes. + From particle physics, I will then move to a string theory description of + cosmology in the presence of time dependent singularities in time dependent + orbifolds. Finally I will focus on computational aspects of string compactifications. - In particular I will deal with recent advancements in deep learning and artificial intelligence applied to the computation of Hodge numbers of Calab-Yau manifolds.
- page 3/102 - I will therefore start with the most geometrical aspects of the discussion, reviewing some basics and introducing a framework to deal with open string amplitudes in the presence of twist and spin fields. + I will therefore start with the most geometrical aspects of the discussion, + reviewing some basics and introducing a framework to deal with open string + amplitudes in the presence of twist and spin fields. - page 4/102 As usual in string theory the starting point is Polyakov's action. - We start directly from the superstring extension where gamma is the worldsheet metric (for the moment generic) and rhos are the 2-dimensional "gamma matrices" forming a representation of the Clifford algebra. + We start directly from the superstring extension where gamma is the + worldsheet metric (for the moment generic) and rhos are the 2-dimensional + "gamma matrices". - page 5/102 - This description presents lots of symmetries which, at the end of the day, provide the key to its success. - In fact the invariance under Poincaré transformations in the target space, and diffeomorphisms and Weyl transformations on the worldsheet are such that the theory is conformal with a vanishing traceless stress tensor. - Using all symmetries the worldsheet metric can be brought to diagonal form. + This description presents lots of symmetries which, at the end of the day, + provide the key to its success. + In fact the invariances of the action are such that the theory is conformal + with a traceless stress tensor. - page 6/102 - Using standard field theory methods the stress energy tensor of the superstring in D dimensions generates the known Virasoro algebra, extending the classical de Witt's algebra. - At quantum level it presents a central charge whose value depends on the dimension of the target space.
+ Using standard field theory methods the stress energy tensor of the + superstring in D dimensions generates the known Virasoro algebra, extending + the classical Witt algebra. + At the quantum level it presents a central charge whose value depends on the + dimension of the target space. - page 9/102 - In order for the stress energy tensor to be a conformal primary field, we need to introduce sets of fields known as conformal ghosts. - These are conformal fields specified by their weight lambda. - They are introduced as a first order Lagrangian theory. + In order for the stress energy tensor to be a conformal primary field, we + need to introduce sets of fields known as conformal ghosts. + These are conformal fields specified by their weight lambda and they are + introduced as a first order Lagrangian theory. - page 8/102 - The central charge arising from the algebra of the modes of the ghost sector can then be used to compensate the superstring counterpart. - Consistency of the theory thus fixes the dimension of the target space to be a 10-dimensional spacetime. + The central charge arising from the algebra of the modes of the ghost sector + can then be used to compensate the superstring counterpart. + Consistency of the theory thus fixes the dimension of the target space to be + a 10-dimensional spacetime. - page 9/102 - The fact that "everyday" physics in accelerators is 4-dimensional is recovered through compactification. - In simple words we recover 4-dimensional physics at low energy by considering the 10-dimensional Minkowski space as a product of the usual 4-dimensional spacetime and a 6-(real)-dimensional space. - The internal space has to obey stringent restrictions: in particular it has to be a compact manifold (to "hide" the extra dimensions), it has to break most of the supersymmetry present at high energy, and its arising gauge algebra has to contain the standard model algebra.
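The central-charge cancellation just recalled can be summarised in one line (standard CFT bookkeeping: each boson X contributes c = 1, each Majorana fermion psi contributes c = 1/2, the (b, c) ghosts -26 and the (beta, gamma) superghosts +11):

```latex
c_{\text{tot}}
  \;=\; \underbrace{\tfrac{3}{2}\,D}_{X^{\mu},\;\psi^{\mu}}
        \;\underbrace{-\,26}_{(b,c)}
        \;\underbrace{+\,11}_{(\beta,\gamma)}
  \;=\; 0
  \quad\Longrightarrow\quad D = 10\,.
```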
+ The fact that "everyday" physics in accelerators is 4-dimensional is + recovered through compactification. + In simple words we recover 4-dimensional physics at low energy by considering + the 10-dimensional Minkowski space as a product of the usual 4-dimensional + spacetime and a 6-(real)-dimensional space. + + The internal space has to obey stringent restrictions: in particular + it has to be a compact manifold (to "hide" the extra dimensions), it has to + break most of the supersymmetry present at high energy, and its arising gauge + algebra has to contain the standard model algebra. - page 10/102 - These manifolds have been firstly postulated by Eugenio Calabi and later proved to exist by Shing Tung Yau, hence the name Calabi-Yau manifolds. - In the case at hand, they are a specific class of 3-dimensional complex manifolds with SU(3) holonomy. + These manifolds were first conjectured by Eugenio Calabi and later proved to + exist by Shing-Tung Yau, hence the name Calabi-Yau manifolds. + In the case at hand, they are a specific class of 3-dimensional complex + manifolds with SU(3) holonomy. They must be Ricci-flat or equivalently have a vanishing first Chern class. - page 11/102 - In general it is not easy to classify these manifolds (as well as computing for instance their metric, which is generally not known). - However, for instance, we will see that the dimension of the complex cohomology groups, known as Hodge numbers, will play a strategic role. + In general it is not easy to classify these manifolds, nor to compute, for + instance, their metric, which is generally not known. + However, we will see that the dimensions of the complex cohomology groups, + known as Hodge numbers, will play a strategic role. - page 12/102 - Going back to Polyakov's action, we solve the equations of motion and the boundary conditions for the string propagation (we focus on the bosonic part for the moment).
- The action in fact introduces naturally the Neumann boundary conditions for the strings. - Moreover the solution factorises into its holomorphic components (where z is the usual coordinate on the complex plane) due to the equations of motion. + We solve the equations of motion and the boundary conditions for the string + propagation (we focus on the bosonic part for the moment). + The action in fact introduces naturally the Neumann boundary conditions for + the strings. + The solution factorises into its holomorphic components (where z is the usual + coordinate on the complex plane) due to the equations of motion. - page 13/102 - Now consider for a second the simplest toroidal compactification of closed strings, that is suppose we can "hide" the last direction of the string on a circle. - As in usual quantum mechanics this leads to momentum quantization (defined as an integer n in units of the compactification radius). - A closed string can moreover wind an integer number of times introducing a "winding number" m stating the number of times which the closed string goes around the compact dimension. + Now consider for a second the simplest toroidal compactification of closed + strings, that is suppose we can "hide" the last direction of the string on a + circle. + + As in usual quantum mechanics this leads to momentum quantization (defined as + an integer n in units of the compactification radius). + + A closed string can moreover wind an integer number of times around the cycle + introducing a "winding number" m. + In turn this is reflected in the spectrum of the theory. - Differently from field theory, shrinking the radius does not decouple the modes, but the compact dimension remains. - In fact the spectrum of theories compactified on a small radius or a large radius does not change.
In other words exchanging R and 1/R does not modify the theory. - page 14/102 - This so called T-duality can also be applied to open strings with a different outcome. - At the level of the fields, T-duality switches the sign of the right modes. - Since open strings cannot wind around the compact dimension, the behaviour of the spectrum is as in field theory: the compact dimension decouples and the open string is constrained on a lower dimensional surface. - This is the result of introducing Dirichlet boundary conditions on the T-dual coordinate, meaning that the endpoints of the string have to reside on the same surface. - The procedure can be applied to several dimensions thus introducing surfaces on which the endpoints of the open string live, called D-branes. + This so called T-duality can also be applied to open strings with a different + outcome. + + Since open strings cannot wind around the compact dimension, the behaviour of + the spectrum is as in field theory: the compact dimension decouples and the + open string is constrained on a lower dimensional surface. + + This is the result of introducing Dirichlet boundary conditions on the T-dual + coordinate, meaning that the endpoints of the string have to reside on the + same surface. + + The procedure can be applied to several dimensions thus introducing surfaces + on which the endpoints of the open string live, called D-branes. - page 15/102 - D-branes naturally introduce preferred directions of motion thus breaking the D-dimensional Poincaré symmetry. - This is also seen at the level of the content of the theory. - It is in fact possible to show that in D-dimensions, the open string sector at massless level contains an Abelian field. - D-branes split the components into a lower dimensional U(1) field on the D-brane and a vector of spacetime scalars. + D-branes naturally introduce preferred directions of motion thus breaking the + D-dimensional Poincaré symmetry. 
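The invariance under exchanging R and 1/R can be checked directly on the standard closed-string mass formula (a minimal sketch in alpha' = 1 units; the function name and the truncation of the mode sums are our own illustration, not the talk's):

```python
# alpha' * M^2 for a closed string on a circle of radius R (alpha' = 1):
# n is the momentum number, m the winding number, `level` the total
# oscillator level N + Ntilde (bosonic normalisation, hence the -2 shift).
def alpha_mass_sq(n, m, R, level=0):
    return (n / R) ** 2 + (m * R) ** 2 + 2 * (level - 2)

# T-duality: exchange momentum and winding while sending R -> 1/R.
# The two spectra then coincide mass by mass.
R = 0.7
spectrum = sorted(alpha_mass_sq(n, m, R) for n in range(-3, 4) for m in range(-3, 4))
dual = sorted(alpha_mass_sq(m, n, 1 / R) for n in range(-3, 4) for m in range(-3, 4))
print(all(abs(a - b) < 1e-9 for a, b in zip(spectrum, dual)))  # True
```

Sorting only removes the relabelling (n, m) -> (m, n); the duality is exact state by state, since (n/R)^2 and (mR)^2 simply trade places.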
+ + It is in fact possible to show that in D-dimensions, the open string sector + at massless level contains an Abelian field. + + D-branes split the components into a lower dimensional U(1) field on the + D-brane and a vector of spacetime scalars. - page 16/102 - In general we have strings whose endpoints are on the same D-brane leading to U(1) gauge theories as well as stretched strings across different branes. - The Chan-Paton factors can be used to label the endpoints of the strings by the position on the D-branes. - It is finally possible to show that when the D-branes are coincident, states can be rearranged into enhanced gauge grooups (for instance, unitary), thus creating non Abelian gauge theories. + In general we can have strings whose endpoints are on the same D-brane + leading to U(1) gauge theories as well as stretched strings across different + branes. + + The Chan-Paton factors can be used to label the endpoints of the strings by + the position on the D-branes. + + It is then possible to show that when the D-branes are coincident, states can + be rearranged into enhanced gauge groups (for instance, unitary), thus + creating non Abelian gauge theories. - page 17/102 - However, physical constraints on the possible constructions pose serious questions on the D-brane dispositions. - A quark can be modelled with a string stretched across two distant branes, but such quark would have a mass proportional to the distance to the branes: chirality in particular would not be possible to define, while being one of the defining features of the Standard Model. - This is where the possibility to put D-branes at an angle with respect to each other becomes crucial. - While most of the modes will indeed gain a mass, the massless spectrum can support chiral states localised at the intersections. + However, physical constraints on the possible constructions pose serious + questions on the D-brane dispositions. 
+ + A quark can be modelled with a string stretched across two distant branes, + but such a quark would have a mass proportional to the distance between the + branes: in particular it would not be possible to define chirality, which is + one of the defining features of the Standard Model. + + This is where the possibility to put D-branes at an angle with respect to + each other becomes crucial. + + While most of the modes will indeed gain a mass, the massless spectrum can + support chiral states localised at the intersections. - page 18/102 - In this framework we consider models built with D6-branes in such a way that four dimensional Minkowski is filled, thus preserving the "physical" Minkowski invariance. - We then embed them in 6-dimensional space (we do not worry about compactification for now). - In this scenario we study correlators of fields in the presence of twist fields, which are a set of conformal fields arising at the intersections. - These are particularly interesting for phenomenological computations: for instance Yukawa couplings from the string theory definition. + In this framework we consider models built with D6-branes. + We embed them in 6-dimensional internal space without worrying about + compactification for now. + + In this scenario we study correlators of fields in the presence of twist + fields, which are a set of conformal fields arising at the intersections. + These are particularly interesting for computing, for instance, Yukawa + couplings in string theory. - page 19/102 - Using the path integral approach, correlators involving twist fields are dominated by the instanton contribution of the classical Euclidean action.
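Schematically, the instanton dominance means (a standard expression; sigma_qu is our shorthand for the quantum fluctuation prefactor and Sigma denotes the twist fields):

```latex
\big\langle \Sigma_{a}\,\Sigma_{b}\,\Sigma_{c} \big\rangle
  \;\sim\; \sigma_{\text{qu}}\; e^{-S_{E}^{\text{cl}}}\,,
```

and for D-branes intersecting as lines in a plane the classical Euclidean action reduces to the triangle area in units of 2 pi alpha'.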
+ + We focus specifically on its computation in the case of three D-branes + necessary for the Yukawa couplings. - page 20/102 - The literature already takes into consideration D-branes embedded as lines in a factorised version of the internal space, where the possible relative rotations are Abelian in each plane of the space. - The contribution of the string in this case is proportional to the area of the triangle formed on the plane by the three D-branes because in fact the string is completely constrained to move on the plane and its worldsheet is in fact the planar area. + The literature already takes into consideration D-branes embedded as lines in + a factorised version of the internal space, where the possible relative + rotations are Abelian in each plane of the space. + + The contribution of the string in this case is proportional to the area of + the triangle formed on the plane by the three D-branes since the + string is completely constrained to move on the plane. - page 21/102 - In what follows we consider a non completely factorised space and we focus on the 4-dimensional part of it. + In what follows we consider a non completely factorised space and we focus on + its 4-dimensional sector. - After filling the physical 4-dimensional space, we study the remaining directions of the D-brane in 4-dimensional internal space (that is planes in four dimensions which intersect in a point) using a well-adapted frame of reference which is in general rotated with respect to global coordinates. + After filling the physical 4-dimensional space, we study the remaining + directions of the D-brane in 4-dimensional internal space using a + well-adapted frame of reference which is in general rotated with respect to + global coordinates. 
- The rotation is not directly an SO(4) element, but it is an element of a particular Grassmannian: in fact separately rotating the Dirichlet and Neumann components does not change anything (we just need to relabel the coordinates), as long as no reflections are involved. + The rotation is not directly an SO(4) element, but it is an element of a + Grassmannian: in fact separately rotating the Dirichlet and Neumann + components does not change anything (we just need to relabel the + coordinates), as long as no reflections are involved. - The rotation involved is therefore a representative of a left equivalence class of possible rotations. + The rotation involved is therefore a representative of a left equivalence + class of possible rotations. - page 22/102 - Once the geometry is specified, we then consider the open strings. + Introducing the usual conformal transformations mapping the strip to the + complex plane, the intersecting branes are mapped to the real axis. - Introducing the usual conformal transformations mapping the strip to the complex plane, the intersecting branes are mapped to the real axis. - D-branes are therefore real intervals on the space, between their intersection points, mapped to the real axis as well. + D-branes are therefore real intervals on the space, between their + intersection points. - page 23/102 - In this definition, the boundary conditions of the strings become a set of discontinuities on the real axis, one for each D-brane, and the embedding equation specifying the intersections. + In this definition, the boundary conditions of the strings become a set of + discontinuities on the real axis, one for each D-brane, and the embedding + equation specifying the intersections. - page 24/102 - Instead of dealing directly with the discontinuities, it is more suitable to introduce an auxiliary function defined over the entire complex plane by gluing functions defined on the upper plane and on the bottom plane on any arbitrary interval. 
- This "doubling trick" transforms the discontinuity into a monodromy factor, when looping any base point through the glued interval. + Instead of dealing directly with the discontinuities, it is more suitable to + introduce an auxiliary function defined over the entire complex plane by + gluing functions defined on the upper plane and on the bottom plane on an + arbitrary interval. + + This "doubling trick" transforms the discontinuity into a monodromy factor, + when looping any base point through the glued interval. - page 25/102 - The fact that rotations are non Abelian leads to two different monodromies for a base points starting in the upper plane or in the bottom plane. + The fact that rotations are non Abelian leads to two different monodromies + for a base point starting in the upper plane or in the bottom plane. + This is a general feature, but the case of 3 D-branes simplifies matters + enough. - page 26/102 Dealing with these 4 x 4 matrices is delicate. - Using a known isomorphism between SO(4) matrices and SU(2) x SU(2), the monodromy matrix can be cast into a tensor product of two 2 x 2 matrices. - The matrices completely encode the rotations of the D-branes: the solution to the string propagation is therefore given by a holomorphic function (due to the equations of motion), satisfying these boundary conditions for each of the three D-branes we consider. - In this matrix form we therefore look for a tensor product of 2-dimensional basis of holomorphic functions with three distinct regular (Fuchsian) monodromies. + + Using a known isomorphism between SO(4) matrices and two copies of SU(2), the + monodromy matrix can be cast into a tensor product of two 2 x 2 matrices. + In this matrix form the solution is therefore given by a 2-dimensional basis + of holomorphic functions with three regular singular points. - page 27/102 - The usual SL(2,R) invariance allows us to fix the monodromies in 0, 1 and infinity.
+ The usual SL(2,R) invariance allows us to fix the monodromies in 0, 1 and + infinity. + The overall solution will then be a superposition of all possible bases. - Given the previous properties we therefore look for a basis of Gauss hypergeometric functions. + + Given the previous properties we therefore look for a basis of Gauss + hypergeometric functions. - page 28/102 - Since we deal with rotations, the parameters of the hypergeometric functions involved are indeed connected to the parameters of the rotation (that is the rotation vectors). - However the choice is not unique and labelled by the periodicity of the rotations. - The superposition of the solutions is however still not the final result. + Since we deal with rotations, the parameters of the hypergeometric functions + involved are indeed connected to the rotation vectors. + + However the choice is not unique and is labelled by the + periodicity of the rotations. - page 29/102 - The reason is a huge redundancy in the description: using the free parameters of the rotations we should in fact fix all degrees of freedom in the solution, which at the moment is an infinite sum involving an infinite amount of free parameters. + The reason is a huge redundancy in the description: using the free parameters + of the rotations we should in fact fix all degrees of freedom in the + solution, which at the moment is an infinite sum involving an infinite amount + of free parameters. - For the moment we only showed that the rotation matrix is equivalent to a monodromy matrix from which can build an overparametrised solution. + In fact we only showed that the rotation matrix is equivalent to a + monodromy matrix from which we can build an overparametrised solution. - Using contiguity relations we can then restrict the sum over independent functions (that is functions which cannot be written as rational functions of contiguous hypergeometrics). + Using contiguity relations we can then restrict the sum over independent + functions.
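To recall the standard structure being exploited (textbook facts about the Gauss hypergeometric equation; only the identification of a, b, c with the rotation data is specific to the construction): around z = 0 one may take the basis

```latex
y_{1}(z) = {}_{2}F_{1}(a,\,b;\,c;\,z)\,, \qquad
y_{2}(z) = z^{1-c}\, {}_{2}F_{1}(a-c+1,\,b-c+1;\,2-c;\,z)\,, \qquad
M_{0} = \operatorname{diag}\!\big(1,\; e^{2\pi i (1-c)}\big)\,,
```

with M_0 the diagonal monodromy around the origin; analogous bases exist around z = 1 and z = infinity, related by the usual connection formulae. Shifting the exponents by integers leaves the monodromy unchanged, which is the periodicity ambiguity mentioned above.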
- Finally requiring the Euclidean action to be finite restricts the sum to only two terms (the particular terms surviving in the sum depend on the rotation vectors but they are never more than two). + Finally requiring the Euclidean action to be finite restricts the sum to only + two terms (the particular terms surviving in the sum depend on the rotation + vectors but they are never more than two). - Imposing the boundary conditions (that is fixing the intersection points) fixes the free constants in the solution. + Fixing the intersection points eventually determines the free constants in the solution. - page 30/102 - The physical interpretation of the solution is finally straightforward in the Abelian case, where the action can be reduced to the sum of the areas of the internal triangles (this is a general result even for a generic number of D-branes). + The physical interpretation of the solution is finally straightforward in the + Abelian case, where the action can be reduced to the sum of the areas of the + internal triangles (this is a general result even for a generic number of + D-branes). - page 31/102 - In the non Abelian case we considered there is no simple way to write the action using global data. - However the contribution to the Euclidean action is larger than the Abelian case: the strings are in fact no longer constrained on a plane and, in order to stretch across the boundaries, they have to form a small bump while detaching from the D-brane. - The Yukawa coupling in this case is therefore suppressed with respect to the Abelian case. + In the non Abelian case we considered there is no simple way to write the + action using global data. + However the contribution to the Euclidean action is larger than the Abelian + case: the strings are in fact no longer constrained on a plane and, in order + to stretch across the boundaries, they have to form a small bump while + detaching from the D-brane. 
+ + The Yukawa coupling in this case is therefore suppressed with respect to the + Abelian case. - page 32/102 - We then turn the attention to fermions and the computation of correlators involving spin fields. - Though ideally extending some ideas, we abandon the intersecting D-brane scenario, and we introduce point-like defects on one boundary of the superstring worldsheeet in its time direction in such a way that the superstring undergoes a change of its boundary conditions when meeting a defect. + We then turn the attention to fermions and the computation of correlators + involving spin fields. + Though ideally extending some ideas, we abandon the intersecting D-brane + scenario, and we introduce point-like defects on one boundary of the + worldsheet in such a way that the superstring undergoes a change of its + boundary conditions when meeting a defect. - page 33/102 - It is possible to show that in this case the Hamiltonian of the theory develops a time dependence since it is in fact conserved only between consecutive defects. + It is possible to show that in this case the Hamiltonian of the theory + is only piecewise conserved. - page 34/102 - Suppose now that we could expand the field on a basis of solutions to the boundary conditions and work, as before, on the entire complex plane. + Suppose now that we could expand the field on a basis of solutions to the + boundary conditions and work, as before, on the entire complex plane. - page 35/102 - Ideally we would be interested in extracting the modes in order to perform any computation of amplitudes. - The definition of the operation is connected to a dual basis whose form is completely fixed by the original field (which we know) and the request of time independence. + Ideally we would be interested in extracting the modes in order to perform + any computation of amplitudes. 
+ The definition of the operation is connected to a dual basis whose form is + completely fixed by the original field (which we know) and the requirement of + time independence. - page 36/102 - The resulting algebra of the operators is in fact defined through such operation and it is therefore time independent. + The resulting algebra of the operators is in fact defined through such + operation and it is therefore time independent. - page 37/102 - Differently from what done in the bosonic case, we focus on U(1) boundary change operators. + Differently from what was done in the bosonic case, we focus on U(1) boundary + change operators. The resulting monodromy on the complex plane is therefore a phase factor. - page 38/102 - As in the previous case we can write a basis of solutions which incorporates the behaviour looping the point-like defects. + As in the previous case we can write a basis of solutions which incorporates + the behaviour when looping around the point-like defects. Consequently we can also define a dual basis. - Notice that both fields are defined up to integer factors, since we are still dealing with rotations. + + Both fields are defined up to integer factors, since we are still dealing + with rotations. - page 39/102 - In order to compute amplitudes we then need to define the space on which the representation of the algebra acts. - We define an excited vacuum, annihilated by positive frequency modes, and the lowest energy vacuum (from the strip definition). + In order to compute amplitudes we then need to define the space on which the + representation of the algebra acts. + We define an excited vacuum, annihilated by positive frequency modes, and the + lowest energy vacuum (from the strip definition). - page 40/102 - Vacua need to be consistent, leading to conditions labelled by an integer factor L relating the basis of solutions with its dual (and ultimately the algebra of operators).
- In fact the vacuum should always be correctly normalised and the description of physics using any two of the vacuum definitions should be consistently equivalent. + Vacua need to be consistent, leading to conditions labelled by an integer + factor L relating the basis of solutions with its dual (and ultimately the + algebra of operators). + The vacuum must always be correctly normalised and the description of physics + using any two of the vacuum definitions should be consistently equivalent. - page 41/102 - To avoid having overlapping in- and out-annihilators, the label L must vanish. + To avoid having overlapping in- and out-annihilators, the label L must + vanish. - page 42/102 - In this framework, the stress energy tensor displays as expected a time dependence due to the presence of the point-like defects. - Specifically it shows that in each defect we have a primary boundary changing operators (whose weight depends on the monodromy) and which creates the excited vacuum from the invariant vacuum. + In this framework, the stress energy tensor displays as expected a time + dependence due to the presence of the point-like defects. + Specifically it shows that in each defect we have a primary boundary changing + operator (whose weight depends on the monodromy) which creates the excited + vacuum from the invariant vacuum. This is by all means an excited spin field. - Moreover the first order singularities display the interaction between pairs of excited spin fields. - Finally (and definitely fascinating), the stress energy tensor obeys the canonical OPE, that is the theory is still conformal (even though there is a time dependence). + + Finally (and definitely fascinating), the stress energy tensor obeys the + canonical OPE, that is the theory is still conformal (even though there is a + time dependence).
- page 43/102 - In formulae, the excited vacuum used in computations is thus created by a radially ordered product of excited spin fields hidden in the defects. + In formulae, the excited vacuum used in computations is thus created by a + radially ordered product of excited spin fields hidden in the defects. - page 44/102 - We are therefore in a position to compute the correlators involving such spin fields (however since we cannot compute the normalisation, we can compute only quantities not involving it). - For instance we reproduce the known result of bosonization where the boundary changing operator is realised through the exponential of a different operator. + We are therefore in a position to compute the correlators involving such spin + fields (however since we cannot compute the normalisation, we can compute + only quantities not involving it). + For instance we reproduce the known result of bosonization. - Moreover, since we have complete control over the algebra of the fermionic fields, we can also compute any correlator involving both spin and matter fields. + Moreover, since we have complete control over the algebra of the fermionic + fields, we can also compute any correlator involving both spin and matter + fields. - page 45/102 - We therefore showed that semi-phenomenological models need the ability to compute correlators involving twist and spin fields. + We therefore showed that semi-phenomenological models need the ability to + compute correlators involving twist and spin fields. - We then introduced a framework to compute the instanton contribution to the correlators using intersecting D-branes and we showed how to compute correlators in the fermionic case involving spin fields as point-like defects on the string worldsheet. 
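The bosonization cross-check mentioned above can be stated compactly (standard free-field formulae; epsilon is our label for the U(1) monodromy parameter):

```latex
\psi^{\pm}(z) = e^{\pm i\,\phi(z)}\,, \qquad
S_{\varepsilon}(z) = e^{i\,\varepsilon\,\phi(z)}\,, \qquad
h\big(S_{\varepsilon}\big) = \frac{\varepsilon^{2}}{2}\,,
```

so a defect producing the monodromy phase e^{2 pi i epsilon} carries exactly the conformal weight read off from the singularity of the stress energy tensor.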
+ We then introduced a framework to compute the instanton contribution to the
+ correlators using intersecting D-branes and we showed how to compute
+ correlators in the fermionic case involving spin fields as point-like defects
+ on the string worldsheet.
- The question would now be how to extend this to non Abelian spin fields and, most importantly, to twist fields, where there is no framework such as bosonization.
+ The question would now be how to extend this to non-Abelian spin fields and,
+ most importantly, to twist fields, where there is no framework such as
+ bosonization.
- page 46/102
- After considering defects and singular points in particle physics, we analyse time dependent singularities in cosmology.
+ After considering defects and singular points in particle physics, I will
+ deal with time dependent singularities in cosmology.
- page 47/102
- As string theory is considered a theory of everything, its phenomenological description should in fact include both strong and electroweak forces as well as gravity.
+ The reason is that, since string theory is considered a theory of everything,
+ its phenomenological description should in fact include both strong and
+ electroweak forces as well as gravity.
- page 48/102
- In particular from the gravity and cosmology side, we would like to have a better view of the cosmological implications in string theory.
+ In particular from the gravity side, we would like to have a better view of
+ the cosmological implications in string theory.
- page 49/102
- For instance we could try to study Big Bang models to gain some better insight with respect to field theory.
+ For instance we could try to study Big Bang models to gain some better
+ insight with respect to field theory.
- page 50/102
- For this, one way would be to build toy models of singularities in time, in which the singular point exists in one specific moment, rather than place.
+ For this, one way would be to build toy models of singularities in time, in
+ which the singular point exists in one specific moment, rather than place.
- page 51/102
- A simple way to make it so is to build toy models from time-dependent orbifolds which can model singluarities as their fixed points.
+ A simple way to make it so is to build toy models from time-dependent
+ orbifolds which can model singularities as their fixed points.
- page 52/102
- In the literature we can already find the computation of amplitudes (mainly closed strings, since we are dealing with gravitational interactions).
- The presence of divergences in N-point correlators is however usually associated to a gravitational backreaction due to exchange of gravitons.
+ In the past people already dealt with such a problem, finding divergences in
+ the computation of amplitudes.
+ The presence of such divergences in N-point correlators is however usually
+ associated with a gravitational backreaction due to exchange of gravitons.
- page 53/102
- However the 4-tachyon amplitude in string theory is divergent already in the open string sector at tree level (thus we are sure no gravitational interaction is present).
+ However the 4-tachyon amplitude in string theory is divergent already in the
+ open string sector at tree level, that is in genuine gauge theories where no
+ gravitational interaction is present.
- The effective field theory interpretation would be a 4-point interaction of scalar fields (higher spins would only spoil the behaviour).
+ The effective field theory interpretation would be a 4-point interaction of
+ scalar fields (higher spins would only spoil the behaviour).
- page 54/102
To investigate further, we consider the so called Null Boost Orbifold.
- The construction starts from D-dimensional Minkowski spacetime through a change of coordinates.
+
+ The construction starts from D-dimensional Minkowski spacetime through a
+ change of coordinates.
- page 55/102
- The orbifold is then built through the periodic identification of one coordinate along the direction of its Killing vector.
-
- Notice that at this time the momentum in this direction will have to be quantized to be consistent.
+ The orbifold is then built through the periodic identification of one
+ coordinate along the direction of its Killing vector, which leads to momentum
+ quantization.
- page 56/102
- From these identifications, we can build the usual scalar wave function obeying the standard equations of motion.
+ From these identifications, we can build scalar wave functions obeying the
+ standard equations of motion.
- Notice the behaviour in the time direction u which already takes a peculiar form and the presence of the quantized momentum in a strategic place.
+ Notice the behaviour in the time direction u which already takes a peculiar
+ form and the presence of the quantized momentum in a strategic place.
- page 57/102
- In order to introduce the divergence problem we first consider a theory of scalar QED.
+ In order to introduce the divergence problem we first consider a theory of
+ scalar QED.
- page 58/102
- When computing the interactions between the fields, the terms involved are entirely defined by two main integrals.
+ When computing the interactions between the fields, the terms involved are
+ entirely defined by two main integrals.
- It might not be immediately visible, but given the behaviour of the scalar functions, any vertex interaction with more than 3 fields diverges.
+ It might not be immediately visible, but given the behaviour of the scalar
+ functions, any vertex interaction with more than 3 fields diverges.
- The reason for the divergence is connected to the "strategically placed" quantized momentum.
- When when all quantized momenta vanish, in the limit of small u (that is near the singularity) the integrands develop isolated zeros preventing the convergence.
- In fact, in this case, even a distributional interpretation (not unlike the derivative of a delta function) fails.
+ The reason for the divergence is connected to the "strategically placed"
+ quantized momentum.
+
+ When all quantized momenta vanish, in the limit of small u (that is near
+ the singularity) the integrands develop isolated zeros preventing the
+ convergence.
+ In fact, in this case, even a distributional interpretation (not unlike the
+ derivative of a delta function) fails.
- page 60/102
So far the situation is therefore somewhat troublesome.
- In fact even the simplest theory presents divergences.
- Moreover, obvious ways to regularise the theory do not work: for instance adding a Wilson line does not cure the problem as divergences also involve neutral strings which would not feel the regularisation.
+ Moreover, obvious ways to regularise the theory do not work: for instance
+ adding a Wilson line does not cure the problem as divergences also involve
+ neutral strings which would not feel the regularisation.
- The nature of the divergence is therefore not just gravitational, but there must be something hidden.
-
- In fact the problems seem to arise from the vanishing volume in phase space along the compact direction: the issue looks like geometrical, rather than strictly gravitational.
+ In fact the problems seem to arise from the vanishing volume in phase space
+ along the compact direction: the issue looks geometrical, rather than
+ strictly gravitational.
- page 61/102
- Since the field theory fails to give a reasonable value for amplitudes involving time-dependent singularities, we could therefore ask whether string theory can shed some light.
+ Since the field theory fails to give a reasonable value for amplitudes
+ involving time-dependent singularities, we could therefore ask whether string
+ theory can shed some light.
- page 62/102
The relevant divergent integrals are in fact present also in string theory.
- They arise from interactions of massive vertices (the first massive vertex is shown here).
+ They arise from interactions of massive vertices like the one shown here.
- These vertices are usually overlooked as they do not play in general a relevant role at low energy.
- However it is possible that near the singularity they might actually give their contribution.
- These vertices are involved at low energy in the definition of contact terms (that is terms which do not involve exchange of vector bosons) in the effective field theory, which therefore is lacking their definition.
+ These vertices are usually overlooked as in general they do not play a
+ relevant role at low energy.
+ However it is possible that near the singularity they might actually give
+ their contribution.
+ These vertices are involved at low energy in the definition of contact terms
+ in the effective field theory, which therefore does not account for them.
- page 63/102
In this sense even string theory cannot give a solution to the problem.
- In other words since the effective theory does not even exist, its high energy completion is not capable of providing a better description.
+ In other words, since the effective theory does not even exist, its high
+ energy completion is not capable of providing a better description.
- page 64/102
There is however one geometric way to escape this.
- Since the issues are related to a vanishing phase space volume, analytically speaking it is sufficient to add a non compact direction to the orbifold in which the particle is "free to escape".
+ Since the issues are related to a vanishing phase space volume, analytically
+ speaking it is sufficient to add a non compact direction to the orbifold in
+ which the particle is "free to escape".
- page 65/102
- While the Generalised Null Boost Orbifold has basically the same definition through one of its Killing vector, the presence of the additional direction acts in a different way on the definition of the scalar functions.
- As you can see the new time behaviour ensures better convergence properties, and the presence of the continuous momentum ensures that no isolated zeros are present at any time.
- In fact even in the worst case scenario, the arising amplitudes would still have a distributional interpretation.
+ While the Generalised Null Boost Orbifold has basically the same definition
+ through one of its Killing vectors, the presence of the additional direction
+ acts in a different way on the definition of the scalar functions.
+
+ As you can see the new time behaviour ensures better convergence properties,
+ and the presence of the continuous momentum ensures that no isolated zeros
+ are present at any time.
+ Even in the worst case scenario, the arising amplitudes would still have a
+ distributional interpretation.
- page 66/102
- We therefore showed that divergences in the simplest theories are present both in field theory and string theory and that in the presence of singularities, the string massive states start to play a role.
+ We therefore showed that divergences in the simplest theories are present
+ both in field theory and string theory and that in the presence of
+ singularities, the string massive states start to play a role.
- The nature of the divergences is however due to vanishing volumes in phase space and cannot be classified as simply a gravitational backreaction.
+ The nature of the divergences is however due to vanishing volumes in phase
+ space and cannot be classified as simply a gravitational backreaction.
+ In fact the introduction of "escape routes" for fields grants a
+ distributional interpretation of the amplitudes.
- It is also possible to show that this is not restricted to "null boost" types of orbifolds, but even other kinds of orbifolds present the same issues.
+ It is also possible to show that this is not restricted to "null boost" types
+ of orbifolds, but even other kinds of orbifolds present the same issues.
- page 67/102
- In summary we showed that the divergences cannot be regarded as simply gravitational, but even gauge theories (that is the open sector of the string theory) present issues.
+ In summary we showed that the divergences cannot be regarded as simply
+ gravitational, but even gauge theories (that is the open sector of the string
+ theory) present issues.
- Their nature is however subtle and connected to the interaction of string massive modes (or contact terms in the low energy formulation) which are not usually studied in detail.
+ Their nature is however subtle and connected to the interaction of string
+ massive modes (or contact terms in the low energy formulation) which are not
+ usually taken into account.
- page 68/102
- We finally move to the last part involving tools for phenomenology in string theory.
+ We finally move to the last section.
- After the analysis of semi-phenomenological analytical models, we now consider a computational task related to compactifications of extra-dimensions using machine learning.
+ After the analysis of semi-phenomenological analytical models, we now
+ consider a computational task related to compactifications of extra
+ dimensions using machine learning.
- page 69/102
We focus on Calabi-Yau manifolds in three complex dimensions.
- Due to their properties and their symmetries, the relevant topological invariants are two Hodge numbers: they are integer numbers and in general can be difficult to compute.
+ Due to their properties and their symmetries, the relevant topological
+ invariants are two Hodge numbers.
- As the number of possible Calabi-Yau 3-folds is an astonishingly huge number, we focus on a subset.
+ As the number of possible compact Calabi-Yau 3-folds is huge, we focus on a
+ subset.
- page 70/102
- Specifically we focus on manifolds built as intersections of hypersurfaces in projective spaces, that is intersections of several homogeneous equations in the complex coordinates of the manifold.
+ Specifically we focus on manifolds built as intersections of hypersurfaces in
+ projective spaces, that is intersections of several homogeneous equations in
+ the complex coordinates of the manifold.
- As we are interested in studying these manifolds as topological spaces, for each equation and projective space we do not care about the coefficients, but only the exponents, or better the degree of the equation in a given coordinate.
- The intersection is complete in the sense that it is non degenerate.
+ As we are interested in studying these manifolds as topological spaces, for
+ each equation and projective space we do not care about the coefficients, but
+ only the exponents, or better, the degree of the equation in a given
+ coordinate.
- page 71/102
- The intersections can be generalised to multiple projective spaces and equations and the manifold can be characterised by a matrix containing the powers of the coordinates in each equation.
+ The intersections can be generalised to multiple projective spaces and
+ equations and the manifold can be characterised by a matrix containing the
+ powers of the coordinates in each equation.
- The problem in which we are interested is therefore to be able to take the so called "configuration matrix" of the manifolds and predict the value of the Hodge numbers.
- Formally this is a map from a matrix to a natural number.
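To make the objects above concrete, here is a minimal sketch in Python of the simplest such configuration, the quintic hypersurface in P^4, whose well-known Hodge numbers are h^{1,1} = 1 and h^{2,1} = 101. The encoding is a hypothetical illustration: real CICY datasets store richer conventions, including the ambient space dimensions.

```python
import numpy as np

# Hypothetical minimal encoding of a configuration matrix:
# rows = ambient projective spaces, columns = defining equations,
# entries = degree of each equation in that space's coordinates.
# (Real CICY datasets also record the ambient dimensions; layouts vary.)

# The quintic threefold: one degree-5 equation in a single P^4.
quintic = np.array([[5]])

# Its well-known Hodge numbers, the targets a model should learn to predict.
hodge = {"h11": 1, "h21": 101}

# Calabi-Yau condition for a hypersurface in P^n: total degree = n + 1.
n_ambient = 4  # complex dimension of the ambient projective space
is_calabi_yau = quintic.sum() == n_ambient + 1
```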
+ The problem in which we are interested is therefore to be able to take the so
+ called "configuration matrix" of the manifolds and predict the value of the
+ Hodge numbers.
- page 72/102
- The real issue is now how to treat the configuration matrix and how to build such map.
+ The real issue is now how to treat the configuration matrix and how to build
+ such a map.
- page 73/102
We use a machine learning approach.
- In very simple words it means that we want to find a new representation of the input (possibly parametrized by some weights which we can tune and control) such that the predicted Hodge numbers are as close as possible to the correct result.
+ In very simple words it means that we want to find a new representation of
+ the input (possibly parametrized by some weights which we can tune and
+ control) such that the predicted Hodge numbers are as close as possible to
+ the correct result.
- In this sense the machine has to learn some way to transform the input to get a result close to what in the computer science literature is called the "ground truth".
+ In this sense the machine has to learn some way to transform the input to get
+ a result close to what in the computer science literature is called the
+ "ground truth".
- The measure of proximity or distance is called "loss function" or "Lagrangian function" (with a slight abuse of naming conventions).
- The machine then learns some way to minimise this function (for instance using gradient descent methods and updating the previously mentioned weights).
+ The measure of proximity or distance from the true value is called "loss
+ function" or "Lagrangian function" (with a slight abuse of naming
+ conventions).
+ The machine then learns some way to minimise this function (for instance
+ using gradient descent methods and updating the parameters).
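The loop just described — tunable weights, a loss measuring distance from the ground truth, gradient descent updates — can be sketched on a toy linear model; this illustrates only the mechanism, not the actual model used in the analysis.

```python
import numpy as np

# Toy gradient descent: tune weights w so that predictions x @ w approach
# the "ground truth" y by minimising a mean-squared-error loss function.
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 3))        # toy inputs
true_w = np.array([1.0, -2.0, 0.5])  # weights generating the ground truth
y = x @ true_w

w = np.zeros(3)   # tunable weights, initially uninformed
lr = 0.1          # learning rate (step size of the descent)
for _ in range(500):
    grad = 2 * x.T @ (x @ w - y) / len(y)  # gradient of the MSE loss
    w -= lr * grad                         # gradient descent update

loss = np.mean((x @ w - y) ** 2)  # final distance from the ground truth
```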
- page 74/102
- We thus exchange the difficult problem of finding an analytical solution with an optimisation problem (it does not imply "easy", but it is at least doable).
+ We thus exchange the difficult problem of finding an analytical solution for
+ an optimisation problem (this does not mean "easy", but it is at least
+ doable).
- page 75/102
- In order to learn the best way of doing this, we can rely on a vast computer science literature and use large physics datasets containing lots of samples from which to infer a structure.
+ In order to learn the best way of doing this, we can rely on a vast computer
+ science literature and use large physics datasets containing lots of samples
+ from which to infer a structure.
- page 76/102
- In this sense the approach can merge techniques from physics, mathematics and computer science benefiting from advancements in all fields.
+ In this sense the approach can merge techniques from physics, mathematics and
+ computer science, benefiting from advancements in all fields.
- page 77/102
- The approach can furthermore provide a good way to analyse data and infer structure and advance hypothesis, which could end up overlooked using traditional brute force algorithms.
+ The approach can furthermore provide a good way to analyse data, infer
+ structure and advance hypotheses which could end up overlooked using
+ traditional brute force algorithms.
- In this case we focus on the prediction of two Hodge numbers with very different distributions and ranges.
- The data we consider were computed using top of the class computing power at CERN in the 80s, with a huge effort by the string theory community.
- In this sense Complete Intersection Calabi-Yau manifolds are a good starting point to investigate the application of machine learning techniques because they are well studied and characterised.
+ In this case we focus on the prediction of two Hodge numbers with very
+ different distributions and ranges.
+ The data we consider were computed using top of the class computing power at
+ CERN in the 80s, with a huge effort by the string theory community.
+
+ In this sense Complete Intersection Calabi-Yau manifolds are a good starting
+ point to investigate the application of machine learning techniques because
+ they are well studied and characterised.
- page 78/102
- The dataset we use contains less than 10000 manifolds (in machine learning terms it is still small).
+ The dataset we use contains fewer than 10000 manifolds (in machine learning
+ terms it is still small).
- From these we remove product spaces (recognisable by their block diagonal form of the configuration matrix) and we remove very high values of the Hodge numbers from training to avoid learning "extremal configurations".
+ From these we remove product spaces (recognisable by the block diagonal
+ form of their configuration matrix) and we remove very high values of the
+ Hodge numbers to avoid learning "extremal configurations".
+ Note that we only remove them from the training data which the machine
+ actually uses to learn.
- In this sense we are simply not feeding the machine "extremal" configurations in an attempt to push as far as possible the application: should the machine learn a good representation, it will automatically be capable of learning also those configurations without a human manually feeding them.
+ In this sense we are simply not giving the machine "extremal" configurations
+ in an attempt to push the application as far as possible: should the machine
+ learn a good representation, it will automatically be capable of learning
+ also those configurations without a human manually feeding them.
- We then define three separate folds: the largest contains training data used by the machine to adjust the parametrisation, 10% of the data is then used for intermediate evaluation of the process (for instance to avoid the machine to overfit the data in the training set), while the last subset is used to give the final predictions.
- Differently from the validation set, the test set has not been seen by the machine and therefore can reliably test the generalisation ability of the algorithm.
+ We then define three separate folds: the largest contains training data used
+ by the machine to adjust the parametrisation, 10% of the data is then used
+ for intermediate evaluation of the process, while the last subset is used to
+ give the final predictions.
+ Unlike the validation set, the test set has not been seen by the
+ machine and therefore can reliably test the generalisation ability of the
+ algorithm.
+
- Differently from previous approaches we consider this as a regression task in the attempt to let the machine learn a true map between the configuration matrix and the Hodge numbers (in case we can also discuss the classification approach as it has some interesting applications itself).
+ Unlike some previous approaches, we consider this as a regression
+ task in an attempt to let the machine learn a true map between the
+ configuration matrix and the Hodge numbers (if needed we can also discuss the
+ classification approach as it has some interesting applications itself).
- page 79/102
- The distributions of the Hodge numbers therefore present less outliers than the initial dataset, but as you can see we expect the result to be similar even without the procedure, since the number of outliers removed is small.
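The three folds can be sketched as follows; the 80/10/10 proportions and the dataset size used here are illustrative assumptions, not the exact splits of the analysis.

```python
import numpy as np

# Illustrative three-fold split: training / validation / test.
# The validation fold monitors overfitting during training; the test fold
# stays unseen until the final evaluation of generalisation.
rng = np.random.default_rng(42)
n_samples = 7800          # order of magnitude of the dataset (assumption)
idx = rng.permutation(n_samples)

n_val = n_test = n_samples // 10          # 10% each
n_train = n_samples - n_val - n_test      # the remaining ~80%

train_idx = idx[:n_train]
val_idx = idx[n_train:n_train + n_val]
test_idx = idx[n_train + n_val:]
```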
+ The pruned distribution of the Hodge numbers therefore presents fewer outliers
+ than the initial dataset, but as you can see we expect the result to be
+ similar even without the procedure, since the number of outliers removed is
+ small.
- In fact we also proved it and if anyone is interested we can also discuss a different more "machine learning accurate" approach to the task that we adopted.
+ This first analysis however proved successful and led to better
+ results.
- page 80/102
- The pipeline we adopt is the same used at industrial level by companies and data scientists.
- We in fact heavily rely on data analysis to improve as much as possible the output.
+ The pipeline we adopt is the same used at industrial level by companies and
+ data scientists.
+ We in fact heavily rely on data analysis to improve the output as much as
+ possible.
- page 81/102
- This for instance can be done by including additional information with respect to the configuration matrix, that is by feeding the machine variables which can be manually derived: by definition they are redundant but can be used to easily learn a pattern.
+ This for instance can be done by including additional information with
+ respect to the configuration matrix, that is by feeding the machine variables
+ which can be manually derived: by definition they are redundant but can be
+ used to easily learn a pattern.
- In fact as we can see most of the features such as the number of projective spaces or the number of equations in the matrix are heavily correlated with the Hodge numbers.
+ In fact as we can see most of the features, such as the number of projective
+ spaces or the number of equations in the matrix, are heavily correlated with
+ the Hodge numbers.
- Moreover even using algorithms to produce a ranking of the variables such as decision trees show that such "engineered features" are much more important than the configuration matrix itself.
+ Moreover using algorithms such as decision trees to produce a ranking of the
+ variables shows that such "engineered features" are much more important
+ than the configuration matrix itself.
- page 82/102
Using the "engineered data", we now get to the choice of the algorithm.
- There is no general rule in this, even though there might be good guidelines to follow.
+ There is no general rule for this, even though there might be good guidelines
+ to follow.
- page 83/102
- Though the approach is clearly "supervised" in the sense that the machine learns by approximating a known result, we also tried other approaches in an attempt to generate additional information which the machine could use.
+ Though the approach is clearly "supervised" in the sense that the machine
+ learns by approximating a known result, we also tried other approaches in an
+ attempt to generate additional information which the machine could use.
- The first approach is a clustering algorithm, intuitively used to look for a notion of "proximity" between the configuration matrices.
+ The first approach is a clustering algorithm, intuitively used to look for a
+ notion of "proximity" between the configuration matrices.
This however did not play a role in the analysis.
- The other is definitely more interesting and it consists in finding a better representation of the configuration matrix using less components.
- The idea is therefore to "squeeze" or "concentrate" the information in a lower dimensional space (matrices in our case have 180 components, so we are trying to aim for something less than that).
+ The other is definitely more interesting and it consists in finding a better
+ representation of the configuration matrix using fewer components.
+ The idea is therefore to "squeeze" or "concentrate" the information in a
+ lower dimensional space (matrices in our case have 180 components, so we are
+ aiming for something smaller than that).
- page 84/102
- For the predictions we first relied on traditional regression algorithms, such as linear models, support vector machines and boosted decision trees.
- I will not enter into the details and differences between the algorithms, but we can indeed discuss them.
+ For the predictions we first relied on traditional regression algorithms,
+ such as linear models, support vector machines and boosted decision trees.
+ I will not enter into the details and differences between the algorithms, but
+ we can indeed discuss them.
- page 85/102
- Let me however say a few words a dimensionality reduction procedure known as "principal components analysis" (or PCA for short), since this is going to be part of my future.
+ Let me however say a few words about a dimensionality reduction procedure
+ known as "principal components analysis" (or PCA for short), since this
+ proved to be an important step in the analysis.
- Suppose that we have a rectangular matrix (which could be number of samples in the dataset times the number of components of the matrix once it has been flattened).
+ Suppose that we have a rectangular matrix (which could be the number of
+ samples in the dataset times the number of components of the matrix once it
+ has been flattened).
- The idea is to project the data onto a lower dimensional surface where the variance if maximised in order to retain as much information as possible.
+ The idea of PCA is to project the data onto a lower dimensional surface where
+ the variance is maximised in order to retain as much information as possible.
This is usually used to isolate a signal from a noisy background.
- Thus by isolating only the meaningful components of the matrix we can hope to help the algorithm.
+ Thus by isolating only the meaningful components of the matrix we can hope to
+ help the algorithm.
- page 86/102
- Visually PCA is used to isolate the eigenvalues and eigenvectors of the covariance matrix (or the singular values of the matrix) which do not belong to the background.
+ Visually PCA is used to isolate the eigenvalues and eigenvectors of the
+ covariance matrix (or the singular values of the matrix) which do not belong
+ to the background.
- From random matrix theory we know that the eigenvalues of an independently and identically distributed matrix (a Wishart matrix) follow a Marchenko-Pastur distribution.
+ From random matrix theory we know that the eigenvalues of the covariance
+ matrix of independently and identically distributed data (a Wishart matrix)
+ follow a Marchenko-Pastur distribution.
- Such matrix containing a signal would therefore be recognised by the presence of eigenvalues outside this probability distribution.
+ Such a matrix containing a signal would therefore be recognised by the
+ presence of eigenvalues outside this probability distribution.
We could therefore simply keep the corresponding eigenvectors.
- In our case this resulted in an improvement of the accuracy, obtained by retaining less than half of the components of the matrix (corresponding to 99% of the variance of the initial set).
+ In our case this resulted in an improvement of the accuracy, obtained by
+ retaining less than half of the components of the matrix (corresponding to
+ 99% of the variance of the initial set).
- page 87/102
As we can see we used several algorithms to evaluate the procedure.
- Previous approaches in the literature mainly relied on the direct application of algorithms to the configuration matrix.
- We extended this beyond the previously considered algorithms (mainly support vectors) to decision trees and linear models for comparison.
+
+ Previous approaches in the literature mainly relied on the direct application
+ of algorithms to the configuration matrix.
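The eigenvalue cut just described can be sketched as follows: build the covariance of the data, compare its spectrum with the Marchenko-Pastur upper edge expected for pure noise, and keep only the eigenvectors above it. The synthetic data and sizes are assumptions for illustration.

```python
import numpy as np

# PCA with a Marchenko-Pastur cutoff: eigenvalues of the covariance of pure
# i.i.d. unit-variance noise lie below the edge (1 + sqrt(p/n))^2, so
# eigenvalues above that edge are taken to carry signal.
rng = np.random.default_rng(1)
n_samples, n_features = 2000, 50

noise = rng.normal(size=(n_samples, n_features))
signal = 0.5 * np.outer(rng.normal(size=n_samples), np.ones(n_features))
data = noise + signal                     # noisy data hiding a rank-1 signal

cov = np.cov(data, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)    # ascending eigenvalues

q = n_features / n_samples
mp_edge = (1.0 + np.sqrt(q)) ** 2         # Marchenko-Pastur upper edge

keep = eigvals > mp_edge                  # components sticking out of the noise
reduced = data @ eigvecs[:, keep]         # lower dimensional representation
```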
+ We extended this beyond the previously considered algorithms (mainly support
+ vectors) to decision trees and linear models for comparison.
- page 88/102
- Techniques such as feature engineering and PCA provide a huge improvement (even with less training data).
- Let me for instance point out the fact the even a simple linear regression reaches the same level of accuracy previously reached by more complex algorithms, even with much less training data.
+ Techniques such as feature engineering and PCA provide a huge improvement
+ (even with less training data).
+ Let me for instance point out the fact that even a simple linear regression
+ reaches the same level of accuracy previously reached by more complex
+ algorithms, even with much less training data.
+ This ultimately can cut computational costs and complexity.
- page 89/102
- However this does not conclude the landscape of algorithms used in machine learning.
+ However this does not conclude the landscape of algorithms used in machine
+ learning.
In fact we also used neural network architectures.
- They are a class of function approximators which use (some variants of) gradient descent to optimise the weights.
- Their layered structure is key to learn highly non linear and complicated functions.
+ They are a class of function approximators which use (some variants of)
+ gradient descent to optimise the weights.
+ Their layered structure is key to learning highly non-linear and complicated
+ functions.
We focused on two distinct architectures.
- The older fully connected networks were employed in previous attempts at predicting the Hodge numbers.
- They rely on a series of matrix operations to create new outputs from previous layers.
- In this sense the matrix W and the bias term b are the weights which need to be updated.
- Each node is connected to all the outputs, hence the name fully connected or densely connected (or equivalently the matrix W does not have vanishing entries).
- To learn non linear functions this is however not sufficient: an iterated application of these linear maps would simply result in a linear function to be learned.
+ The older fully connected networks were employed in previous attempts at
+ predicting the Hodge numbers.
+ They rely on a series of matrix operations to create new outputs from
+ previous layers.
+ In this sense the matrix W and the bias term b are the weights which need to
+ be updated.
+ Each node is connected to all the outputs of the previous layer, hence the
+ name fully connected or densely connected.
+ Equivalently this means that the matrix W does not have vanishing
+ entries.
+
+ This alone is however not sufficient to learn non-linear functions: an
+ iterated application of these linear maps would simply compose into a single
+ linear map.
We "break" linearity by introducing an "activation function" at each layer.
- The second architecture is called convolutional from its iterated application of "sliding window function" (that is convolutions) applied on the layers.
+ The second architecture is called convolutional from its iterated
+ application of "sliding window" functions (that is, convolutions) to the
+ layers.
- page 90/102
- Convolutional networks have several advantages over a fully connected approach.
+ Convolutional networks have several advantages over a fully connected
+ approach.
- Since the input in this case does not need to flattened, convolutions retain the notion of vicinity between cells in a grid (here we have an example of a configuration matrix as seen by a convolutional neural network).
+ Since the input in this case does not need to be flattened, convolutions
+ retain the notion of vicinity between cells in a grid (here we have an
+ example of a configuration matrix as seen by a convolutional neural
+ network).
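The earlier point about linearity can be made concrete. A minimal numpy sketch with hand-picked toy weights (not the networks used in the work): stacking two purely linear layers collapses to a single linear map, while inserting an activation function between them does not.

```python
import numpy as np

def dense(x, W, b):
    """Fully connected layer: every output depends on every input."""
    return W @ x + b

def relu(x):
    """A common activation function used to break linearity."""
    return np.maximum(x, 0.0)

# Toy weights chosen so the hidden layer has a negative entry.
W1 = np.array([[1.0, 2.0], [3.0, -4.0]])
b1 = np.array([0.5, -0.5])
W2 = np.array([[1.0, 1.0]])
b2 = np.array([0.0])
x = np.array([1.0, 1.0])

# Two stacked linear layers are exactly one linear layer in disguise...
two_linear = dense(dense(x, W1, b1), W2, b2)
one_linear = dense(x, W2 @ W1, W2 @ b1 + b2)

# ...while an activation in between changes the function being computed.
with_activation = dense(relu(dense(x, W1, b1)), W2, b2)
```

Here `two_linear` and `one_linear` agree exactly, while `with_activation` differs because ReLU zeroes the negative hidden value.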
- Since they do not have one weight for each connection, they have a smaller number of parameters (proportional to the size of the window) to be updated (in our specific case we cut by more than one order of magnitude the number of parameters used).
+ Since they do not have one weight for each connection, they have a smaller
+ number of parameters to update (proportional to the size of the window).
+ In our specific case we cut the number of parameters by more than one order
+ of magnitude with respect to fully connected networks.
- Moreover weights are shared by adjacent cells, meaning that if there is a structure to be inferred, this is the way to go to exploit the "artificial intelligence" underlying the operations involved.
+ Moreover weights are shared by adjacent cells, meaning that if there is a
+ structure to be inferred, this weight sharing is the natural way to exploit
+ the "artificial intelligence" underlying the operations involved.
- page 91/102
- In this sense a convolutional architecture can isolate defining features of the output and pass them to the following layer as in the animation.
+ In this sense a convolutional architecture can isolate defining features of
+ the output and pass them to the following layer as in the animation.
- For instance, using a computer science analogy, this can be used to classify objects given a picture: a convolutional neural network is literally capable of isolating what makes a dog a dog and what distinguishes it from a cat (even more specific it can separate a Labrador from a Golden Retriever).
+ For instance, using a computer science analogy, this can be used to classify
+ objects given a picture: a convolutional neural network is literally capable
+ of isolating what makes a dog a dog and what distinguishes it from a cat
+ (even more specifically, it can separate a Labrador from a Golden
+ Retriever).
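To make the parameter counting above concrete, here is a back-of-the-envelope comparison. All sizes are purely illustrative (not those of the actual networks): a dense layer needs one weight per input-output pair, while a convolutional layer needs only one weight per window cell per filter.

```python
# Illustrative sizes for a CICY-like configuration-matrix input.
h, w = 12, 15                  # input grid
units = 180                    # outputs of a dense layer on the flattened grid
filters, kh, kw = 32, 5, 5     # convolution: 32 filters with a 5x5 window

dense_params = (h * w) * units + units      # weights + biases
conv_params = kh * kw * filters + filters   # single input channel

ratio = dense_params / conv_params          # well over one order of magnitude
```

With these toy sizes the dense layer carries 32580 parameters against 832 for the convolution, a factor of roughly 39.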
- page 92/102
- This has in fact been used in computer vision tasks in recent year for pattern recognitions, object detections and spatial awareness tasks (for instance to isolate the foreground from the background).
- In this sense this is the closest approximation of artificial intelligence in supervised tasks.
+ This has in fact been used in computer vision in recent years for pattern
+ recognition, object detection and spatial awareness tasks (for instance to
+ isolate the foreground from the background).
+ In this sense this is the closest approximation of artificial intelligence in
+ supervised tasks.
- page 93/102
- My contribution in this sense is inspired by deep learning research at Google.
- In recent years they were able to devise new architectures using so called "inception modules" in which different convolution operations are used concurrently.
- The architecture has better generalisation properties since more features can be detected and processed at the same time.
+ My contribution in this sense is inspired by deep learning research at
+ Google.
+ In recent years they were able to devise new architectures using so-called
+ "inception modules" in which different convolution operations are used
+ concurrently.
+ The architecture has better generalisation properties since more features can
+ be detected and processed at the same time.
- page 94/102
- In our case we decided to go for two concurrent convolutions one scanning each equation (the vertical kernel) of the configuration matrix, while a second convolutions scans the projective spaces (in horizontal).
+ In our case we decided to go for two concurrent convolutions: one scanning
+ each equation of the configuration matrix (the vertical kernel), while a
+ second convolution scans the projective spaces (the horizontal kernel).
- The layer structure is then concatenated until a single output is produced (the Hodge number that is).
+ The layer structure is then concatenated until a single output is produced
+ (that is, the Hodge number).
- The idea is that this way the network can learn a relation between projective spaces and equations and recombine them to find a new representation.
+ The idea is that this way the network can learn a relation between projective
+ spaces and equations and recombine them to find a new representation.
- page 95/102
- As we can see even the simple introduction of a traditional convolutional kernel (it was a 5x5 kernel in this case) is sufficient to boost the accuracy of the predictions (previous best results in 2018 reached only 77% of accuracy on h^{1,1}).
+ As we can see even the simple introduction of a traditional convolutional
+ kernel (the result shown was reached with a 5x5 kernel) is sufficient to
+ boost the accuracy of the predictions (previous best results in 2018 reached
+ only 77% accuracy on h^{1,1}).
- page 96/102
- The introduction of the Inception architecture has major advantages: it uses even less parameters than "traditional" convolutional networks, it boosts the performance reaching near perfect accuracy, it needs a lot less data (even with just 30% of the data for training, the accuracy is already near perfect).
+ The introduction of the Inception architecture has major advantages: it uses
+ even fewer parameters than "traditional" convolutional networks, it boosts
+ the performance reaching near perfect accuracy, and it needs a lot less data
+ (even with just 30% of the data for training, the accuracy is already near
+ perfect).
- Moreover with this architecture we were able to predict also h^{2,1} with 50% accuracy: even if does not look a reliable method to predict it (I agree, for now), mind that previous attempts have usually avoided computing it, or they reached accuracies as high as 8-9% (even feature engineering could boost it only around 35%).
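The two concurrent branches described above can be sketched in numpy. This is a toy illustration with a random matrix and random kernels (the trained architecture is more elaborate, and which axis corresponds to equations and which to projective spaces is an illustrative choice here): a full-height kernel produces one feature per column, a full-width kernel one feature per row, and the two feature maps are concatenated as in an inception module.

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.integers(0, 5, size=(12, 15)).astype(float)  # toy configuration matrix

k_v = rng.normal(size=(12, 1))   # vertical kernel: spans a whole column
k_h = rng.normal(size=(1, 15))   # horizontal kernel: spans a whole row

# Slide each kernel over the matrix ("valid" convolution): the vertical
# branch yields one feature per column, the horizontal one per row.
feat_v = np.array([float(np.sum(M[:, [j]] * k_v)) for j in range(M.shape[1])])
feat_h = np.array([float(np.sum(M[[i], :] * k_h)) for i in range(M.shape[0])])

# Concatenate the two branches before further processing down to the
# single Hodge-number output.
features = np.concatenate([feat_v, feat_h])
```
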
+ Moreover with this architecture we were also able to predict h^{2,1} with 50%
+ accuracy: even if it does not look like a reliable method to predict it (I
+ agree, for now), mind that previous attempts have usually avoided computing
+ it, or they reached accuracies of only 8-9% (even feature engineering could
+ boost it only to around 35%).
- The network is also solid enough to predict both Hodge numbers at the same time: trading a bit of the accuracy for a simpler model, it is in fact possible to let the machine learn the existing relation between the Hodge numbers without specifically inputing anything (for instance by inserting the fact that the difference of the Hodge numbers is the Euler characteristic).
+ The network is also solid enough to predict both Hodge numbers at the same
+ time: trading a bit of accuracy for a simpler model, it is in fact possible
+ to let the machine learn the existing relation between the Hodge numbers
+ without specifically inputting anything (such as the fact that the
+ difference of the Hodge numbers is related to the Euler characteristic).
- For more specific info I invite you to take a look at Harold Erbin's talk on the subject at the recent "string_data" workshop.
+ For more details I invite you to take a look at Harold's talk on the subject
+ at the recent "string_data" workshop.
- page 97/102
- Deep learning can therefore be used conscientiously (and I cannot stress this enough) as a predictive method, provided that one is able to analyse the data (no black boxes should ever be admitted).
+ Deep learning can therefore be used as a predictive method, provided that one
+ is able to analyse the data (no black boxes should ever be admitted).
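For reference, the relation between the Hodge numbers and the Euler characteristic alluded to above is a standard Calabi-Yau threefold identity (not something specific to this work):

```latex
\chi = 2 \left( h^{1,1} - h^{2,1} \right)
```

so fixing either Hodge number together with the Euler characteristic would determine the other; the point in the text is that the network is not told this relation explicitly.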
- page 98/102
- It can also be used a source of inspiration for inquiries and investigations always provided a good analysis is done beforehand (deep learning is a black box in the sense that once it starts it is difficult to keep track of what is happening under the bonnet, but not because we supposedly do not know what is going on in general).
+ It can also be used as a source of inspiration for inquiries and
+ investigations, always provided a good analysis is done beforehand.
- page 99/102
@@ -626,23 +976,34 @@
- page 100/102
- Moreover convolutional networks look promising and with a lot of unexplored potential.
- This is in fact the first time in which they have been successfully used in theoretical physics.
+ Moreover convolutional networks look promising, with a lot of unexplored
+ potential.
+ This is in fact the first time they have been successfully used in
+ theoretical physics.
- Finally, this is an interdisciplinary approach in which a lot is yet to be learned from different perspective.
+ Finally, this is an interdisciplinary approach in which a lot is yet to be
+ learned from different perspectives.
- page 101/102
More directions to investigate now remain.
- In fact one could in principle exploit freedom in representing the configuration matrices to learn the best possible representation.
+ In fact one could in principle exploit the freedom in representing the
+ configuration matrices to learn the best possible representation.
- Otherwise one could start to think about this in a mathematical embedding and study what happens for CICY 4-folds (almost one million complete intersections).
+ Otherwise one could start to think about this in a mathematical embedding and
+ study what happens for CICY 4-folds (almost one million complete
+ intersections).
- Moreover, as I was saying, this could be used as an attempt to study formal aspects of deep learning, or even more to directly dive into the "real artificial intelligence" and start to study the problem in a reinforcement learning environment where the machine automatically learns a task without knowing the final result.
+ Moreover, as I was saying, this could be used as an attempt to study formal
+ aspects of deep learning, or, going further, to dive directly into "real
+ artificial intelligence" and study the problem in a reinforcement learning
+ environment where the machine automatically learns a task without being
+ given the final result.
- page 102/102
- I will therefore leave the open question as to whether this is actually going to be the end or just the start of something else.
+ I will therefore leave open the question as to whether this is actually
+ going to be the end or just the start of something else.
In the meantime I thank you for your attention.
diff --git a/thesis.pdf b/thesis.pdf
index 2b28aa4..a01045e 100644
Binary files a/thesis.pdf and b/thesis.pdf differ
diff --git a/thesis.tex b/thesis.tex
index 03e3ce2..494eee9 100644
--- a/thesis.tex
+++ b/thesis.tex
@@ -1765,13 +1765,11 @@
\begin{columns}[T, totalwidth=\linewidth]
\begin{column}{0.7\linewidth}
\begin{itemize}
- \item general framework for \textbf{D-branes at angles}
+ \item \textbf{D-branes at angles} and \textbf{defect CFT}
$\quad \rightarrow \quad$ \textbf{spin and twist fields}
- \item alternative computations of \textbf{correlators of spin fields}
+ \item \textbf{time dependent orbifolds}
$\quad \rightarrow \quad$ strings and \textbf{divergences}
- \item strings and divergences in \textbf{time dependent orbifolds}
-
- \item string compactifications and \textbf{deep learning techniques}
+ \item \textbf{deep learning} $\quad \rightarrow \quad$ CICY and \textbf{topological properties}
\end{itemize}
\end{column}
\begin{column}{0.3\linewidth}