AI Assistant
Transcript
00:00:00I always go to.
00:00:07Learn and I think some somewhere to be detta.
00:00:22[unclear] The biological unit reflects
00:00:35the fact that you can have, for instance in the case of hemoglobin,
00:00:40you know it has four chains.
00:00:46So you can have the entire complex of four chains,
00:00:51or you can have just a combination
00:00:53of chains A and B, so the two different ones,
00:00:57or you can have two copies of the same protein.
00:01:04Sometimes they are different:
00:01:08the second copy has a different conformation, so they are
00:01:12both present in the crystal and reported
00:01:16as the asymmetric unit of the [unclear], obviously.
00:01:29Here it is also important to distinguish what
00:01:34superposition is. So the simple case is superposition:
00:01:41I know what the points in the first structure are, what
00:01:47the points in the second structure are, and I
00:01:49know what corresponds to what. So, meaning,
00:01:54we have two objects with the same number of points, and the goal is to minimize
00:02:01the distance by applying rotation and translation operations.
00:02:06So there are superposition algorithms, and one of them,
00:02:10the one that we saw, was the [unclear]. So you see
00:02:19the target function, just the minimal distance; this
00:02:22is usually calculated as the root-mean-square deviation.
00:02:28And the idea is simple: you just
00:02:33calculate the center, I mean the geometric center,
00:02:38of the two sets of points, you translate
00:02:41one set onto the other, just calculating the distance between
00:02:46the two geometric centers,
00:02:49and once you have translated them one on top
00:02:52of the other you just have to find
00:02:54the optimal rotation, the one that minimizes the distance.
00:02:58To do this you can exploit
00:03:02the cross-covariance between the two sets of points,
00:03:08and you can decompose this covariance matrix,
00:03:12usually with singular value decomposition. Essentially you will find
00:03:16the matrix that transforms one set of points into
00:03:21the other, and that is the optimal rotation that you want to find.
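The superposition procedure described above (center both point sets, build the cross-covariance matrix, decompose it with SVD, apply the resulting rotation) can be sketched as follows. This is a minimal illustration, assuming NumPy is available and that the two coordinate arrays are already paired row by row; the procedure is commonly known as the Kabsch algorithm, and the function name `kabsch_superpose` is made up for this example.

```python
import numpy as np

def kabsch_superpose(P, Q):
    """Superpose point set P onto Q (both N x 3, rows already paired).

    Returns the transformed copy of P and the resulting RMSD.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # 1. Translate both sets so that their geometric centers sit at the origin.
    centroid_p = P.mean(axis=0)
    centroid_q = Q.mean(axis=0)
    P0 = P - centroid_p
    Q0 = Q - centroid_q
    # 2. Cross-covariance matrix between the two centered sets (3 x 3).
    H = P0.T @ Q0
    # 3. Singular value decomposition of the covariance matrix.
    U, _, Vt = np.linalg.svd(H)
    # 4. Guard against an improper rotation (a reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) > 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # 5. Rotate P and move it onto the centroid of Q.
    P_fit = (R @ P0.T).T + centroid_q
    rmsd = float(np.sqrt(((P_fit - Q) ** 2).sum(axis=1).mean()))
    return P_fit, rmsd
```

Applied to a structure and a rotated, translated copy of itself, the returned RMSD should be numerically zero.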
00:03:30Questions? For alignments,
00:03:35we have to add an additional task, that is
00:03:40identifying what is the subset of points in
00:03:45the other structure that matches the points, or
00:03:49a subset of points, in the first structure. So you don't want
00:03:54in general to align the two whole structures, you want to find
00:04:00the largest possible subset of
00:04:02points that are similar between the two structures.
00:04:06Of course this is not...
00:04:09This is not easy, because in
00:04:11principle you could compare all points
00:04:15versus all the other points in the structure, or maybe there are
00:04:21multiple subregions in the structure
00:04:24that align with the partner structure,
00:04:27but maybe there is something in the middle,
00:04:30like an insertion or deletion. So
00:04:33we have seen various algorithms to solve the problem, and...
00:04:42But first I introduced the concept
00:04:47of contact maps, because some of them use contact maps
00:04:52instead of coordinates. So contact maps, [unclear].
00:04:57So what is this? A distance matrix, and you can simply
00:05:00filter it just by applying a cutoff, and
00:05:05this is a comparison of all atoms versus all atoms of
00:05:11the protein, so it's symmetric, and this is in ångström.
00:05:18When you look at a contact map you can recognize
00:05:21the secondary structure elements. Remember, alpha helices are
00:05:29close to the main diagonal, because these are made
00:05:35of contacts between close amino acids
00:05:40in the sequence, and the diagonal represents
00:05:44consecutive residues. If you
00:05:48get away a little bit from the main diagonal,
00:05:52you are evaluating amino acids
00:05:55close to each other in the sequence.
00:05:59If you see contacts away from the main diagonal,
00:06:05you are finding contacts between amino acids
00:06:08very distant in the sequence, and you can see
00:06:13continuous stretches of contacts, and
00:06:16sometimes these represent beta sheets. Antiparallel beta sheets,
00:06:21when you form contacts between
00:06:25one stretch and a stretch somewhere else in the sequence
00:06:30but in the opposite direction. And so you
00:06:33can also have parallel ones, where you have
00:06:37a block that is interacting with another block
00:06:41in the sequence and with the same order.
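As a small illustration of the contact map described above, the sketch below thresholds all pairwise distances with a cutoff. The 8 Å default and the use of one representative point per residue are assumptions made for this example; the lecture only says that a cutoff is applied to a distance matrix.

```python
import math

def contact_map(coords, cutoff=8.0):
    """Return a symmetric boolean matrix: True where two points are within cutoff.

    coords: list of (x, y, z) tuples, one per residue (or atom), in ångström.
    cutoff: distance threshold in ångström (8.0 is an assumed, illustrative value).
    """
    n = len(coords)
    cmap = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            in_contact = math.dist(coords[i], coords[j]) <= cutoff
            # The matrix is symmetric, so fill both triangles at once.
            cmap[i][j] = cmap[j][i] = in_contact
    return cmap
```

Cells close to the main diagonal correspond to residues that are near each other in the sequence; cells far from it correspond to long-range contacts.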
00:06:51So, you have to find
00:06:56the subset with the largest possible number of equivalent points.
00:07:03You see, within small deviations, so you have to decide
00:07:08what you consider as an acceptable threshold to say
00:07:13a set of points are equivalent. And usually, you see, if you
00:07:18have two points that are six ångström away,
00:07:25they are not that similar.
00:07:29So you can do that; we saw four different algorithms.
00:07:39One is [unclear],
00:07:45based on dynamic programming. Another is
00:07:49DALI, based on the comparison of
00:07:54distance matrices. Then Combinatorial Extension,
00:07:59where the idea was to compare fragments and then
00:08:04extend those fragments to cover as much as
00:08:08possible of the alignment. And then there is
00:08:13a heuristic that exploits
00:08:16another distance measure, the TM-score, that
00:08:22actually is less sensitive to
00:08:25outliers. So if you find a good alignment and you have
00:08:29a few positions that are far apart, you are
00:08:32not penalized that much because of these positions
00:08:37[unclear], so it has
00:08:42superior performance in identifying related structures.
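To see why a distance measure can be less sensitive to outliers than RMSD, here is a small sketch. It assumes the measure referred to above is the TM-score, whose standard form sums 1 / (1 + (d_i / d0)^2) over aligned pairs and divides by the target length, with d0 = 1.24 (L - 15)^(1/3) - 1.8. The 0.5 Å floor on d0 for short chains and the function names are assumptions of this example, and the distances are taken from one fixed superposition rather than optimized.

```python
import math

def tm_score(distances, target_length):
    """TM-score for a fixed superposition, given per-pair distances in ångström."""
    if target_length > 15:
        d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    else:
        d0 = 0.5
    d0 = max(d0, 0.5)  # assumed floor for short chains
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length

def rmsd(distances):
    """Root-mean-square deviation over the same per-pair distances."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))
```

With 100 aligned pairs at 1 Å, moving a single pair out to 30 Å roughly triples the RMSD, while the TM-score drops by less than 0.01, because each pair contributes at most 1/L to the sum.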
00:08:54You can look at the details a bit more in the slides,
00:09:01but what you have to remember is the fact that we
00:09:04have seen in class four different types of algorithm.
00:09:10They are more or less effective depending on
00:09:14the situation, and they are using different strategies.
00:09:18So the idea is for you to remember that
00:09:23you can use different strategies to align two structures.
00:09:28They are different and [unclear].
00:09:34Questions?
00:09:41Okay, so, superposition, alignment, so.
00:09:47Once you have structural alignments,
00:09:51and when you have a lot of structures in
00:09:55your databases, when you start comparing structures you can
00:10:00start making hypotheses about the evolution of proteins. As we
00:10:06saw about evolution, we
00:10:11started from one important observation, that is,
00:10:18that for similar structures you can see
00:10:23any level of sequence identity
00:10:28between the two. So you can have cases in
00:10:30which the sequence identity is very high, as well as cases in
00:10:36which the sequence identity is low. So the idea...
00:10:44The idea behind this is of course
00:10:48an intuition, that is, that we expect sequences
00:10:52that are similar to also have structures that are
00:10:55similar, and sequences that are different
00:10:59to also have structures that are different.
00:11:03This is not the case for proteins,
00:11:05because if you have two structures that are very similar,
00:11:11you also have cases in which the sequence is very different.
00:11:16How is that possible, and how can we exploit
00:11:19this for making evolutionary hypotheses about proteins?
00:11:25Well, we then did the same exercise,
00:11:28but using pairs of proteins that have
00:11:32different structures, and what you see here is that
00:11:36you will never see any pair of proteins with different structure and
00:11:43high sequence identity. Or better, there is a threshold above which, if
00:11:54the sequence identity is above that, we can
00:12:00say that the structures are similar. So this is what this plot,
00:12:07this experiment, is telling us: you will
00:12:09never find different structures with
00:12:13sequence identity higher than [unclear].
00:12:16So if I search a sequence against the database and I find a hit
00:12:21with at least [unclear] percent identity,
00:12:29I can say that they have the same structure, or the
00:12:34same fold, the same overall shape of the three-dimensional conformation. And
00:12:39so based on this you can actually...
00:12:45you can actually transfer
00:12:48information across, between proteins.
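The identity-threshold argument above can be made concrete with a short sketch. The function names are hypothetical, and two choices are assumptions of this example: identity is computed over aligned (non-gap) columns only, and the threshold is left as a required argument because its value is not recoverable from the recording.

```python
def percent_identity(aln_a, aln_b, gap="-"):
    """Percent identity between two rows of a pairwise alignment.

    Only columns where both sequences have a residue are counted
    (one of several possible conventions, chosen here as an assumption).
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have the same length")
    aligned = [(a, b) for a, b in zip(aln_a, aln_b) if a != gap and b != gap]
    if not aligned:
        return 0.0
    matches = sum(a == b for a, b in aligned)
    return 100.0 * matches / len(aligned)

def likely_same_fold(aln_a, aln_b, threshold):
    """True when identity reaches the caller-supplied threshold (in percent)."""
    return percent_identity(aln_a, aln_b) >= threshold
```

For example, `percent_identity("ACDE-G", "ACEEFG")` counts four identical residues over five aligned columns.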
00:12:54And this is super important.
00:12:56This is the basis of modern
00:12:59bioinformatics, it is the basis of modeling,
00:13:04the fact that you can make hypotheses about
00:13:07evolution. And indeed one important aspect
00:13:13is [unclear]
00:13:22sequence divergence. So the hypothesis is
00:13:25that the important aspect for
00:13:29proteins to function is the structure.
00:13:32If you break the structure you are sort
00:13:35of breaking the function of the protein. So what
00:13:39you can see during evolution is
00:13:41that two proteins in two different species
00:13:44can accumulate mutations, so their sequences diverge.
00:13:51And these mutations are
00:13:55accumulated provided that they don't affect the structure.
00:14:00So if those mutations don't break the structure,
00:14:04they are accepted. So if you wait years, millennia,
00:14:10you can see the same protein, for
00:14:12instance hemoglobin, that exists in
00:14:17many, many species, or animals, in different species.
00:14:26If they are far apart, you will see that the sequence is
00:14:31different but the structure is the
00:14:34same. So it's like the difference between proteins,
00:14:40the sequence difference, is like a clock,
00:14:44what is called the molecular clock, and it lets us
00:14:48date when the two proteins or the species
00:14:54were split in the past. So we come from
00:14:59a common ancestor, and during
00:15:01evolution species split, and so by looking at the differences
00:15:06you can sort of reconstruct the evolutionary events
00:15:12of the past. So of course you can have
00:15:15different levels of diversity.
00:15:19You can have what are called
00:15:21close homologs, so cases in which you have the same function, same structure and
00:15:27similar sequence identity you see this is not so and
00:15:35so that you see you are
00:15:40the distance is not that is the same protein one.
Homology means that two proteins are thought to have
00:15:54the same ancestor, a common ancestor, and it is an inference.
00:16:01So the fact that we are saying they are
00:16:03[unclear] percent identical means that they come from the same ancestor.
00:16:10And then you can be more
00:16:14specific. When you say two proteins are homologs you are just saying
00:16:19the two proteins are related, they have a common ancestor.
00:16:24You can also say if they are orthologs or paralogs. Orthologs means
00:16:33the two proteins are in
00:16:34different species; with paralogs you have two proteins inside the same species.
00:16:41So you have a copy of the same protein.
00:16:45They both came from a common ancestor, but at some point
00:16:49during evolution our genes, one of our genes,
00:16:53split, duplicated, and became, for instance...
00:17:00We have hemoglobin, and we also have
00:17:02myoglobin and other globin-like proteins. They
00:17:07come from a duplication event,
00:17:11and we have two chains with different functions.
00:17:16So we have homologous proteins,
00:17:19they are related, and we have them in different species and in the same species.
00:17:25Then we can also classify them
00:17:28as close homologs if they also share a high level
00:17:32of sequence identity, or distant or remote homologs if they share a low value of sequence identity.
Last case, that is
00:17:47convergent evolution. So you would have two different genes that,
00:17:54from a different starting point, accumulate
00:17:57mutations that lead to structures
00:18:02that could be very similar to structures coming from a
00:18:06different ancestor. So this is called convergent evolution, and the proteins,
00:18:12they are not homologs
00:18:14anymore because they come from different ancestors, and they are called
00:18:18analogs. So analogs meaning same shape, same function,
00:18:25but a different ancestor, or the ancestor is not clear.
00:18:34This is also at the basis of biology, of evolutionary biology.
00:18:39So we start to look at the composition of things that we know today
00:18:49and start making inferences about what is
00:18:54the [unclear] of proteins, or what is the [unclear] of proteins,
00:18:59[unclear], how many they are, if
00:19:04they are shared across species and so on. And the way this is done
00:19:10is by classifying proteins from
00:19:14a structural perspective and identifying
00:19:20common patterns, structural elements, common
00:19:27elements. So one example that
00:19:31is at the base of the hypothesis [unclear] is
00:19:35that there are super-secondary structure elements,
00:19:38so it seems like proteins are made, or many
are made, of elementary small local conformations.
00:19:51And the hypothesis is that complex proteins are
00:19:57the result of the combination of these small elements in [unclear].
00:20:04Here is an example, and here you have
00:20:08a phylogenetic hypothesis
00:20:12showing duplication or evolutionary events during time.
00:20:21We also saw some cases that
00:20:25could explain why sometimes you see [unclear] between
00:20:30sequences and structures. And for instance you
00:20:33see how duplication works and how you can have
00:20:37what is called circular permutation, where you see
00:20:41that a fragment of the original protein is cut
00:20:47and the beginning of it seems like it is translated to the [unclear] of the
00:20:56protein. You see how that is possible: the gene is
00:21:00duplicated and then is cut in a different position.
00:21:08And therefore, as I said, how we reason about what is
00:21:14the structural complexity in
00:21:17living organisms was not by classifying proteins,
00:21:22but by classifying what are called
00:21:24domains. Proteins are made not just of combinations of small elements,
00:21:30but also, at a higher level, they are combinations of what are called
00:21:35domains. Domains can be thought of as small proteins,
00:21:40independent proteins, and so you can
00:21:44cut proteins between two domains and you won't break
00:21:47the structure, so they stay folded,
00:21:50and that is the classical definition of domain.
00:21:54So a domain is a module or a component of a protein that
00:22:02can fold autonomously, and it is stable
00:22:07even when the rest of the protein is removed.
00:22:13And domains also evolve
00:22:18independently, and you can find
00:22:21the same domain in
00:22:23different combinations with other domains across proteins.
00:22:26So we started classifying not proteins,
00:22:29but domains, so fragments or portions of proteins that are minimal
00:22:36at the structural and functional level. We saw two
databases, SCOP and CATH.
00:22:43We focus on CATH; basically it is the same for [unclear]. In
00:22:52CATH, the acronym, you see, represents different levels of
00:22:58classification: Class
00:23:03at the first level, where you just
00:23:05estimate or evaluate the content of secondary structure elements.
00:23:11So you can have proteins that are all alpha or all beta, and you can have [unclear].
00:23:19Then you have the second level, Architecture,
00:23:25where you have the general arrangement of the secondary structure.
00:23:50Here you need someone who decides where
00:23:54this is going. At the lower level,
00:23:59Topology, you are talking about
00:24:03the fold, and this is more precise. So you see, for example,
00:24:07in the sandwich you have [unclear] in
00:24:10the middle, sandwiched by two layers
00:24:13of [unclear]. This is true for the proteins that are here,
00:24:19but actually they are different.
00:24:23[unclear] then you get that they are similar, so same [unclear].
00:24:30And there is another level, that is
00:24:34the Homologous superfamily. In this case you also
00:24:41evaluate sequence identity, so you should
00:24:46be able to distinguish homologs from analogs.
00:24:51So if the sequence identity of two proteins is [unclear], this
00:25:01is very likely... they will be within
00:25:06the same homologous superfamily, [unclear].
00:25:14So the thing that [unclear] is to
00:25:17generate hidden Markov models for every homologous superfamily.
00:25:24So you take all the [unclear], you superpose them,
00:25:29you create a multiple sequence alignment, and out of
00:25:32this multiple alignment you can build a probabilistic model to search for
00:25:38similar sequences in
00:25:40the biological databases. This is something that I
00:25:43cover in the other course, in the first year of the master.
00:25:54And of course, if you then
00:25:57analyze the distribution of domain architectures in the PDB,
00:26:05where you have the structures, or you use those models to search
00:26:09these domains in complete proteomes,
00:26:15you can actually notice that there is a bias in
00:26:19the PDB, [unclear]. So the structures that are in
00:26:26the PDB are not representative of all proteins, they
00:26:32simply represent what are the proteins of
00:26:35interest for the scientific community,
00:26:39what are the structures that are
00:26:41analyzed by the scientific community, because some proteins you
00:26:45cannot crystallize or solve,
00:26:48because maybe they are too large, they are too unstable.
00:26:52They need to be stable and [unclear].
00:27:00So after this
00:27:04we started to look at the comparative modeling task.
00:27:08So the algorithm is super simple.
00:27:17It simply looks for similar sequences. Keep in mind,
00:27:23the task is to predict the structure of
00:27:25a protein, so what is the structure given the sequence.
00:27:31Just a few slides ago you saw that when you have [unclear] identity
00:27:36you can safely assume to have
00:27:41the same structure. So this is the principle you want to
00:27:44exploit. So you have your input sequence,
00:27:47you want to search the PDB and see if there are other proteins
00:27:52that have the same or a similar sequence, and then
00:27:57project, transfer, the coordinates of that protein to
00:28:02your protein. And so
00:28:08the first task is to search the sequence
00:28:12against the PDB database, and then you build
00:28:16the model, so you transfer the coordinates of
00:28:18the main chain and then you model the side chains.
00:28:25So don't be fooled by the fact that you transfer
00:28:29the coordinates of the main chain. Think about it: if you
00:28:34have two proteins with [unclear] percent identity, it means
00:28:41that most of the side chains are
00:28:43different and they are possibly in different orientations.
00:28:48But even if the identity is
00:28:53that, it doesn't mean that the similarity is low, and
00:28:59also it doesn't mean that it is
00:29:02a pure quantitative comparison, because
00:29:05the position of the identical amino acids is
00:29:07important. So those identical positions
00:29:11are those that are [unclear], the others
00:29:17could be mutated during evolution. So you project and transfer
00:29:25all the main-chain atoms, and then you find the best orientation
00:29:33of the side chains,
00:29:36the optimal orientation to minimize clashes, to optimize [unclear],
00:29:41to form contacts.
00:29:44It's an easy task because of the degrees of
00:29:46freedom: just sampling different orientations of
00:29:49side chains is not like
00:29:51sampling the different conformations of the main chain.
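The coordinate-transfer step of comparative modeling described above can be sketched like this. It is a toy illustration, not what any particular modeling server does: the function name is hypothetical, the alignment is assumed to be given, one coordinate per template residue stands in for the full main chain, and side-chain placement is left out entirely.

```python
def transfer_backbone(target_aln, template_aln, template_coords, gap="-"):
    """Copy template coordinates onto the target along a pairwise alignment.

    target_aln, template_aln: the two rows of the alignment (same length).
    template_coords: one coordinate tuple per template residue, in order.
    Returns a list of (target_residue, coordinate_or_None); None marks target
    residues aligned to a gap, which would need to be modeled separately.
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have the same length")
    model = []
    template_index = 0
    for target_res, template_res in zip(target_aln, template_aln):
        coord = None
        if template_res != gap:
            coord = template_coords[template_index]
            template_index += 1
        if target_res != gap:
            model.append((target_res, coord))
    return model
```

Target residues that receive `None` correspond to insertions relative to the template; template residues aligned to a gap in the target are simply skipped.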
00:30:02Here we saw an example using SWISS-MODEL, how you can use
00:30:10very modern software with
00:30:14a nice interface to analyze the models that you are generating.
00:30:21So this was comparative modeling, which dominated
00:30:26bioinformatics.
00:30:27For twenty years this was
00:30:30the state of the art for modeling structure. You can
00:30:33push it also to templates with lower sequence identity
00:30:39if you need to, if you don't have alternatives. At some point
00:30:46people started to notice
00:30:50that we need to have also
00:30:52ab initio prediction methods, so methods that are not based on what you have
00:30:58in the PDB, on
00:31:00templates, that can actually generalize and provide
00:31:04a prediction for any type of
00:31:06sequence that we want to predict. So we saw
00:31:14CASP, the Critical Assessment of
00:31:16protein Structure Prediction, a challenge
00:31:19to assess methods for structure prediction and do benchmarking.
00:31:28How it works: [unclear].
00:31:39Groups provide predictions, then
00:31:43in a few months the structural evidence is completed, so
00:31:49experiments are performed and the coordinates of the native structure
00:31:54are collected experimentally, so we know what is the shape of
00:31:59the structures, and then they evaluate
00:32:02how close the predictions are to the [unclear]. There are different
00:32:09challenges: you can for instance predict just contacts.
00:32:13Usually targets are divided
00:32:17by difficulty, so there are some
00:32:22that are more difficult to predict because they are not similar to any
00:32:26other, and so they are
00:32:29the models without templates, the real ab initio, free modeling
00:32:34challenge, and others that are instead similar to
00:32:37[unclear]; they are different.
00:32:48[unclear] the slides.
00:32:56This is the typical output
00:33:04of the challenge. What you see, based on the difficulty of the target,
00:33:09is the average GDT_TS. GDT is similar to
00:33:15[unclear], based on outliers,
00:33:21with thresholds applied at different levels of
00:33:23outliers, and you see what [unclear] in
00:33:27[unclear], but also in [unclear].
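For reference, GDT_TS is usually defined as the average, over the cutoffs 1, 2, 4 and 8 Å, of the percentage of residues whose predicted position lies within that cutoff of the native one. The sketch below is a simplification: it takes per-residue distances from a single fixed superposition, whereas the full measure searches for the best superposition at each cutoff.

```python
def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS (0 to 100) from per-residue distances in ångström."""
    n = len(distances)
    # Fraction of residues within each cutoff.
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)
```

Because each residue is only counted as inside or outside each cutoff, a few badly placed residues lower the score by a bounded amount instead of dominating it.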
00:33:34This is the trend line where you have the methods.
00:33:41[unclear] AlphaFold started to
00:33:45become accurate also on difficult cases, and
00:33:49this is the result of [unclear] with AlphaFold,
00:33:54[unclear].
00:34:01You see the [unclear] is also [unclear] the experimental structure, so it's
00:34:10close to experimental accuracy,
00:34:16and it is regardless of whether the target is easy or
00:34:20difficult, so it's really very accurate. And
00:34:24those targets were tested in CASP, [unclear] small proteins.
00:34:30It was very, very accurate.
00:34:33Okay. The other method we saw
00:34:38is Rosetta. Rosetta dominated CASP,
00:34:44[unclear] until 2016, and it has
00:34:48been designed by David Baker, who also won
00:34:51the Nobel Prize a couple of years ago for chemistry,
00:34:57especially for protein design,
00:34:59while the AlphaFold developers got the
00:35:05Nobel Prize for prediction.
00:35:09But Rosetta was designed to build
00:35:13the structure starting from segments. So the idea of Rosetta
00:35:21is that, according to the developers,
00:35:24in the PDB you don't have the structures for
00:35:28all proteins out there, but at least you have the structure of
00:35:34all possible local conformations of segments of nine residues
00:35:42or similar. So the idea is that the local conformations
00:35:47of short stretches of amino acids are covered
00:35:53well. So you can exploit this: you
00:35:57can fragment all the structures that you have, you
00:36:01create your pool of fragments, and you need to find
00:36:05the best assembly of those fragments to fit your
00:36:12sequence. So the first task is to find the subset of
00:36:18best fragments, and this can be done by comparing
00:36:23the sequence against [unclear] to capture this.
00:36:32So [unclear] this formula that captures this. And then
00:36:39the other difficult task is how you
00:36:42combine those fragments, so what are the
00:36:46conformations, the moves that you have to do
00:36:50between two consecutive fragments. And to do that
00:36:57Rosetta exploits simulated annealing: essentially it tests
00:37:04randomly different possible conformations, discards
00:37:11quickly the bad ones, and keeps sampling
00:37:17the energy landscape until the global minimum is reached.
00:37:23It uses discriminant functions
00:37:29to calculate and to discriminate immediately what are
00:37:34the wrong conformations,
00:37:39and so this is the statistical tool that tells you how
00:37:44good, how likely is
00:37:49your prediction. So essentially, given your solution,
00:37:54we can estimate what is the probability of correctness
00:38:01for our solution in Rosetta.
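The search strategy described above (propose random conformational moves, quickly reject bad ones, keep sampling the energy landscape while the temperature drops) can be sketched generically. This is not Rosetta's implementation: the energy function, the move set, the cooling schedule and all parameter values are toy assumptions chosen only to show the Metropolis acceptance rule inside a simulated-annealing loop.

```python
import math
import random

def simulated_annealing(energy, propose, start, steps=2000,
                        t_start=2.0, t_end=0.01, seed=0):
    """Generic simulated annealing; returns the best state found and its energy."""
    rng = random.Random(seed)
    current, e_current = start, energy(start)
    best, e_best = current, e_current
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        candidate = propose(current, rng)
        e_candidate = energy(candidate)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if (e_candidate <= e_current
                or rng.random() < math.exp((e_current - e_candidate) / t)):
            current, e_current = candidate, e_candidate
            if e_current < e_best:
                best, e_best = current, e_current
    return best, e_best

def toy_energy(angles):
    """Hypothetical rugged energy over a list of torsion-like angles."""
    return sum((a - 1.0) ** 2 + 0.5 * math.sin(5.0 * a) for a in angles)

def perturb(angles, rng):
    """Hypothetical move: nudge one randomly chosen angle."""
    new = list(angles)
    i = rng.randrange(len(new))
    new[i] += rng.uniform(-0.5, 0.5)
    return new
```

A call such as `simulated_annealing(toy_energy, perturb, [3.0] * 5)` returns the lowest-energy state visited; in a fragment-assembly setting the move would instead replace the conformation of one segment with that of a library fragment.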
00:38:08Of course you can use
00:38:11different types of discriminant functions, and, you know,
00:38:16if you remember Bayes, conditional probability,
00:38:20you can factor, write, the probability
00:38:26of the correct structure given the sequence.
00:38:29You can write it in this way. And essentially you
00:38:33can calculate or write other things. For instance,
00:38:39the probability of the structure can be related
00:38:43to the radius of gyration, the spread
00:38:49of atoms around the center of mass. Or you can also write
00:38:57this probability of the sequence given
00:39:01the structure, and you can for instance represent it as
00:39:05the probability of a specific pair of
00:39:08residues given an environment, where the environment can
00:39:14be secondary structure or solvent exposure.
00:39:19Or you can also write something more precise
00:39:23that's related to the statistical potentials that we saw: this
00:39:27is the probability of one amino acid to be in a specific environment.
00:39:32So what is the propensity of it to
00:39:36be on the surface or at a certain level of exposure. This
00:39:42is very similar to the PDF potential, that is, what is
00:39:48the probability of specific residues to be at a given distance.
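The slide formula is not part of the recording, so as a reference for the factorization mentioned above, here is Bayes' rule in its standard form. Since the sequence is fixed during a prediction, the denominator is a constant and the score reduces to the product of a sequence-given-structure term and a structure-only term.

```latex
P(\text{structure} \mid \text{sequence})
  = \frac{P(\text{sequence} \mid \text{structure})\, P(\text{structure})}
         {P(\text{sequence})}
  \propto P(\text{sequence} \mid \text{structure})\, P(\text{structure})
```

The structure-only term is where compactness measures such as the spread of atoms around the center of mass enter, while the sequence-given-structure term is where the environment and pair terms enter.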
00:39:55So again,
00:39:58this is also similar to this one,
00:40:00but in this case
00:40:01we are looking at the distance between two amino acids.
00:40:05And again, this one can capture some aspects of
00:40:10the structure, like whether two amino acids
00:40:14that are for instance of opposite charge,
00:40:18[unclear] on the surface, tend to be closer, to be
00:40:23in contact, or if two
00:40:24hydrophobic or hydrophilic amino acids are close in space,
00:40:31[unclear] to be on the surface or to be at
00:40:36a certain distance and so on. So you
00:40:39can check, based on the structures that you have predicted,
00:40:44whether the distances are [unclear].
00:40:51Of course you can complicate
00:40:52it even more and also integrate the environment, so solvent
00:40:58exposure or secondary structure, and
00:41:01so the probability of specific amino acids
00:41:04to be at a given distance, in a given environment
00:41:08for the first amino acid and a given environment for the second amino acid.
00:41:14[unclear] the meaning of this thing, just
00:41:18looking at the same probability when you
00:41:22consider the environment or when you don't
00:41:26consider the environment, and how these energies are calculated
00:41:31depending on the different [unclear], so both opposite charges or same charges.
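One common way to turn such probabilities into energies is the inverse Boltzmann relation, E = -kT ln(P_observed / P_reference), evaluated per distance bin. The sketch below assumes that relation; the function name, the pseudocount and the example counts are hypothetical and are not taken from the slides.

```python
import math

def pair_potential(observed_counts, reference_counts, kT=1.0, pseudocount=1.0):
    """Distance-dependent statistical potential for one residue pair type.

    observed_counts[i]: how often the pair is seen in distance bin i.
    reference_counts[i]: how often any pair is seen in distance bin i.
    Returns one energy per bin; negative values mean the pair is found
    in that bin more often than expected from the reference.
    """
    n_bins = len(observed_counts)
    obs_total = sum(observed_counts) + pseudocount * n_bins
    ref_total = sum(reference_counts) + pseudocount * n_bins
    energies = []
    for obs, ref in zip(observed_counts, reference_counts):
        p_obs = (obs + pseudocount) / obs_total
        p_ref = (ref + pseudocount) / ref_total
        energies.append(-kT * math.log(p_obs / p_ref))
    return energies
```

With made-up counts such as `pair_potential([30, 10, 10], [10, 10, 10])`, the first bin gets a favorable (negative) energy, which is the behavior one would expect for oppositely charged residues at short distance.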
00:41:41So this was a little bit complicated.
00:41:45In the PDF the slides are probably better, but I don't remember.
00:41:56This is this slide, sorry.
00:42:10About how these probabilities are computed.
00:42:24Okay, this should also be easy,
00:42:31less technical.
00:42:33So what about other types of proteins?
00:42:40What are the so-called
00:42:42non-globular proteins?
00:42:45So what we have seen are methods for the structure of proteins
00:42:51where the specific topology, the specific interaction patterns,
00:42:57are very important to preserve the structure.
00:43:01There are other types of proteins that are characterized by not
00:43:08having a fixed structure, or that are just different because of the data:
00:43:14we don't have data, like for membrane proteins,
00:43:18they are very difficult to crystallize, and
00:43:21so we don't have many examples, and algorithms that are based on
00:43:29knowledge, on parameters, [unclear] work for these [unclear].
00:43:40[unclear] what are the non-globular proteins.
00:43:45Other cases are low-complexity sequences;
00:43:51the way to classify these proteins is
00:43:54just looking at the sequence and calculating the
00:44:00entropy or the complexity of
00:44:03the amino acid sequence. We have proteins that can adopt
00:44:09multiple conformations in three-dimensional space and can
00:44:15change conformation very quickly; they are very
00:44:19dynamic and they are designed to be like
00:44:21that because they can perform different functions.
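The entropy criterion mentioned above for low-complexity sequences can be illustrated with Shannon entropy over the amino acid composition of a sliding window. The window length and the entropy threshold below are illustrative assumptions, not values given in the lecture.

```python
import math
from collections import Counter

def shannon_entropy(seq):
    """Shannon entropy (in bits) of the residue composition of seq."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def low_complexity_windows(seq, window=12, max_entropy=2.2):
    """Start positions of windows whose entropy is at or below max_entropy."""
    return [i for i in range(len(seq) - window + 1)
            if shannon_entropy(seq[i:i + window]) <= max_entropy]
```

A homopolymer stretch has entropy 0, while a window in which every residue is different reaches the maximum, log2 of the window length.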
00:44:24We can have pathogenic conformations, like aggregates, amyloids.
00:44:32They are non-native,
00:44:35but they are important and they are available in the PDB,
00:44:40so you have some pathological conditions.
00:44:45And there are proteins that are created by the
00:44:48repetition of very small units
00:44:52or elements, similar to the structural elements
00:44:56but sometimes more complex or extended. They are
00:45:01characterized by [unclear]; some of them
00:45:06are extended, [unclear],
00:45:10so they offer an extended platform for
00:45:15interactions, and they are usually used
00:45:21for designing new functions or designing
00:45:23new proteins, because you have a stable scaffold.
00:45:26You can modify units, you can add
00:45:30insertions, increase the length or the number of
00:45:34the elements, and you will keep the protein fold.
00:45:39The fact that the protein is able to fold is very important,
00:45:45not only for function;
00:45:48it is also important for [unclear].
00:45:52If a protein doesn't fold, [unclear] soluble, [unclear].
00:46:01[unclear] the fact that we have open structures and closed structures.
00:46:09They are special, they are especially
00:46:14degenerate in terms of [unclear]: the first and the last units are
00:46:20different. They fold following [unclear] of
00:46:27local conformation [unclear]; the folding starts from
00:46:32usually one of the two sides and there is a cascade of folding,
00:46:37every unit, one after the other.
00:46:44Classification provided by [unclear].
00:46:49Open structures, closed structures. Solenoids are
00:46:55usually the open structures that are formed by [unclear],
00:47:03[unclear], so they form
00:47:07superhelical structures.
00:47:12Instead [unclear], you see, they are closed, and they are usually called [unclear].
00:47:17And they are the most important classes.
00:47:24We saw algorithms to identify elements or units.
00:47:31This is an example where you can build a multiple sequence alignment
00:47:36[unclear] of the structures and the connectivity within each element.
00:47:43And we saw other methods, and another one is the dot plot.
00:47:50The dot plot is similar to a contact map,
00:47:54but don't be fooled, because in this case
00:47:55we are just looking at the sequence.
00:47:57So here the idea is to use the sequence and see if we can find patterns that
00:48:06repeat, and again exploiting the fact
00:48:10that similar sequences [unclear].
00:48:15If you have a pattern that is
00:48:18similar somewhere else, when we compare the sequence against itself
00:48:22we should be able to find the repetition. So you see, this is the [unclear]:
00:48:27it's the same sequence that's duplicated.
00:48:33And you can also have inversions, [unclear].
00:48:44The methods that exploit this are just about
00:48:49finding the best diagonal in the dot plot
00:48:57matrix and extending it.
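The self-comparison dot plot described above can be sketched in a few lines. This toy version marks exact residue matches only and scores each off-diagonal by its number of matches; real tools typically use windows and substitution scores, and the function names here are hypothetical.

```python
def dot_plot(seq_a, seq_b):
    """Boolean matrix with True wherever the residues of the two sequences match."""
    return [[a == b for b in seq_b] for a in seq_a]

def best_repeat_offset(seq, min_offset=1):
    """Offset of the off-diagonal with the most matches in the self dot plot.

    Returns (offset, matches); the offset estimates the repeat length.
    """
    plot = dot_plot(seq, seq)
    n = len(seq)
    best_offset, best_matches = None, 0
    for offset in range(min_offset, n):
        # Walk along the diagonal that lies `offset` cells above the main one.
        matches = sum(plot[i][i + offset] for i in range(n - offset))
        if matches > best_matches:
            best_offset, best_matches = offset, matches
    return best_offset, best_matches
```

For a sequence made of a 10-residue unit repeated twice, the strongest off-diagonal lies at offset 10.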
00:49:02And another method, that is called
00:49:09REPETITA, exploits [unclear].
00:49:14In this case it
00:49:18transforms the sequence into profiles of amino acid properties.
00:49:24You have the Atchley scale representation, and what
00:49:29you want to find is the repetition of
00:49:35peaks. Or better, you can
00:49:39actually compute the Fourier transform of the representation of
00:49:44the sequence according to the Atchley features and find it
00:49:50in these five different profiles
00:49:53that are provided by the representation.
00:49:56You can find the fundamental frequency that
00:50:01represents the repetition, as in a time series, that you
00:50:06see in this representation of
00:50:09the amino acids. So you see the profiles for
00:50:13the frequencies, where you have all the possible [unclear], and
00:50:22the peaks corresponding to
00:50:23the best frequency, the best
00:50:27period, that is captured by the representation. Given the
00:50:34fundamental frequency, seven, you see seven is a multiple [unclear].
00:50:44[unclear] are the peaks that you see. You can compute the period
00:50:54using this formula here, and you know how long
00:51:00is the most probable stretch of the elements. And there is another method.
00:51:13this is not able to the sequence you need to aim units,
00:51:22but in this case what you are to the present
00:51:26is the city of the coordinate so in
00:51:31this case you just have
00:51:33the coordinate you see
00:51:36di ossidazione the coordinate so the problem,
00:51:40of course on what dimension you and how to
00:51:45the structure is so what is the rotation to times,
00:51:52think and you can do you have you see
00:51:57local minimal local maxima you can calcolate
00:52:00how distant two peaks in the sequence,
00:52:04you can measure here and If there is variations.
00:52:12Between consecutive and setup for.
00:52:18A final final period.
00:52:21This is an example of period
00:52:27for a protein that multiple times and is the same.
00:52:35Profile for protein that globulare.
00:52:39So you see sometimes you find some signal for specific period.
00:52:44But in generale di the profiles in this way to a real time.
00:52:56Is not the database time for the proteins.
00:53:02I also provide some
00:53:06references to papers published to calculate
00:53:10some properties for proteins.
00:53:13Okay. Protein dynamics, embeddings:
00:53:22is there any topic you would like to revisit?
00:53:27I have
00:53:30a few minutes also tomorrow, so if there is
00:53:34something specific you want to
00:53:37go through, [unclear]
00:53:40the most interesting part, or molecular dynamics,
00:53:45if you think it's not super clear.
00:53:50Okay, so. End of Science.
00:54:00Okay. Okay. Thank you.