Lecture 2026/06/03 (Course review)

Transcript

00:00:00I always go to.

00:00:07Learn and I think some somewhere to be detta.

00:00:22Somewhere. Where? You biologico is.

00:00:35Take, for instance, the case of hemoglobin.

00:00:40You know it has four chains, so to say.

00:00:46So you can have the entire complex of four chains,

00:00:51or you can have just a combination

00:00:53of chains, so the two different ones,

00:00:57or you can have two copies of the same protein.

00:01:04Sometimes they are different:

00:01:08the second copy has a different conformation, so they are

00:01:12both present in the crystal and reported

00:01:16as the asymmetric unit, of course.

00:01:29It is also important to distinguish what

00:01:34superposition is. The simple case is superposition:

00:01:41I know what the points in the first structure are, what

00:01:47the points in the second structure are, and I

00:01:49know what corresponds to what. Meaning:

00:01:54we have two objects with the same number of points, and the goal is to minimize

00:02:01the distance by applying rotation and translation operations.

00:02:06There are several superposition algorithms, and one of them is

00:02:10the one that we saw. So you see,

00:02:19the target function is just the minimal distance, and this

00:02:22is generally calculated as a root mean square deviation (RMSD).

00:02:28And it is simple: you just

00:02:33calculate the center, I mean the geometric center,

00:02:38of the two sets of points, you translate

00:02:41one set onto the other, just by calculating the distance between

00:02:46the two geometric centers,

00:02:49and once you have translated them one on top

00:02:52of the other you just have to find

00:02:54the optimal rotation, the one that minimizes this distance.

00:02:58To do this you can exploit

00:03:02the cross-covariance between the two sets of points,

00:03:08and you can decompose this covariance matrix,

00:03:12usually with singular value decomposition. Essentially you will find

00:03:16the matrix that transforms one set of points into

00:03:21the other, and that is the optimal rotation that you want to find.
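
The superposition steps described above (geometric centers, cross-covariance, singular value decomposition) can be sketched as follows. This is a minimal illustration assuming NumPy and two `(N, 3)` coordinate arrays with known one-to-one correspondence. The procedure is commonly known as the Kabsch algorithm; that it is the exact variant shown in class is an assumption, and the function name `superpose` is hypothetical.

```python
import numpy as np

def superpose(P, Q):
    """Superpose point set P onto Q (both (N, 3), same order of points).

    Returns the transformed copy of P and the RMSD after superposition.
    """
    # 1. Geometric centers of the two sets of points
    center_p, center_q = P.mean(axis=0), Q.mean(axis=0)
    P0, Q0 = P - center_p, Q - center_q

    # 2. Cross-covariance matrix (3 x 3) between the centered sets
    H = P0.T @ Q0

    # 3. Singular value decomposition of the cross-covariance
    U, _, Vt = np.linalg.svd(H)

    # 4. Avoid an improper rotation (reflection) by fixing the determinant sign
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T  # optimal rotation: R @ p is close to q

    # 5. Rotate the centered P and move it onto the center of Q
    P_fit = P0 @ R.T + center_q
    rmsd = np.sqrt(((P_fit - Q) ** 2).sum(axis=1).mean())
    return P_fit, rmsd
```

A typical check is to rotate and translate a random set of points by a known transformation and verify that the RMSD after superposition is numerically zero.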

00:03:30Questions? For alignment,

00:03:35we have to add an additional task, that is

00:03:40identifying what is the subset of points in

00:03:45the other structure that matches the points, or

00:03:49a subset of points, in the first structure. So you don't want

00:03:54in general to align the two structures as a whole; you want to find

00:04:00the largest possible subset of

00:04:02points that are similar between the two structures.

00:04:06Of course,

00:04:09this is not easy, because in

00:04:11principle you could compare all points

00:04:15versus all the other points in the structure, or maybe there are

00:04:21multiple subregions in the structures

00:04:24that align with the partner structure,

00:04:27but maybe there is something in the middle,

00:04:30like an insertion or a deletion. So

00:04:33we have seen various algorithms to solve the problem.

00:04:42First, we introduced the concept

00:04:47of contact maps, because some of them use contact maps

00:04:52instead of coordinates. So, contact maps: I think you know everything.

00:04:57It is a distance matrix, and you can simply

00:05:00filter it by applying a cutoff.

00:05:05It is the comparison of all atoms versus all atoms of

00:05:11the protein, so it is symmetric, and the distances are in angstrom.

00:05:18When you have a contact map you can recognize

00:05:21the secondary structure elements. Remember, alpha helices are

00:05:29close to the main diagonal, because they are made

00:05:35of contacts between amino acids that are close

00:05:40in the sequence. The diagonal represents

00:05:44consecutive residues; if you

00:05:48move a little bit away from the main diagonal

00:05:52you are evaluating amino acids

00:05:55close to each other in the sequence.

00:05:59If you see contacts far away from the main diagonal,

00:06:05you are finding contacts between amino acids

00:06:08very distant in the sequence, and you can see

00:06:13continuous stretches of contacts. These

00:06:16sometimes represent beta sheets: antiparallel beta sheets

00:06:21when contacts form between

00:06:25one stretch and another stretch somewhere else in the sequence

00:06:30but in the opposite direction. For parallel

00:06:33beta sheets you instead have

00:06:37a block that is interacting with another block

00:06:41in the sequence with the same orientation.
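
A contact map as described above can be sketched in a few lines. This is only an illustration, assuming NumPy and one representative point per residue (for example the C-alpha atom). The cutoff of 8 angstrom is an assumed example value, not one stated in the lecture.

```python
import numpy as np

def contact_map(coords, cutoff=8.0):
    """Return the symmetric distance matrix and the boolean contact map.

    coords: (N, 3) array, one point per residue, in angstrom.
    cutoff: distance threshold in angstrom (assumed example value).
    """
    # All-versus-all differences via broadcasting: shape (N, N, 3)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # A contact is any pair closer than (or exactly at) the cutoff
    return dist, dist <= cutoff
```

Entries close to the main diagonal of the returned map correspond to residues that are close in sequence; entries far from it are contacts between residues that are distant in sequence.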

00:06:51So, you have to find the

00:06:56subset with the largest possible number of equivalent points,

00:07:03within small deviations. So you have to decide

00:07:08what you consider an acceptable threshold to say

00:07:13that a set of points is equivalent. Usually, if you

00:07:18have two sets of points that are six angstrom away,

00:07:25they are not that similar.

00:07:29To do that, we saw four different algorithms.

00:07:39One is

00:07:45based on dynamic programming,

00:07:49DALI is based on the comparison of

00:07:54distance matrices, then there is combinatorial extension (CE),

00:07:59where the idea was to compare fragments and then

00:08:04extend those fragments to cover as much as

00:08:08possible of the sequence, and then there is

00:08:13a heuristic that exploits

00:08:16another distance measure, not the RMSD, one that

00:08:22actually is less sensitive to

00:08:25outliers. So if you find a good alignment and you have

00:08:29a few positions that are far apart, you are

00:08:32not penalized that much because of these positions,

00:08:37and so it has

00:08:42superior performance in identifying related structures.
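
To make the point about outliers concrete, the sketch below compares the RMSD with the TM-score, a well-known measure that is less sensitive to outliers. That the TM-score is the measure referred to in the lecture is an assumption. The input is the list of distances between aligned residue pairs after superposition. The lower bound of 0.5 on `d0` is a safeguard for very short chains and is also an assumption of this sketch.

```python
import numpy as np

def rmsd(distances):
    """Root mean square deviation over aligned pairs (angstrom)."""
    d = np.asarray(distances, dtype=float)
    return float(np.sqrt((d ** 2).mean()))

def tm_score(distances, target_length):
    """TM-score computed from the distances of aligned pairs.

    Each pair contributes 1 / (1 + (d / d0)^2), so a distant pair adds
    almost nothing instead of dominating the sum as in the RMSD.
    """
    d = np.asarray(distances, dtype=float)
    # Length-dependent scale; 0.5 is a floor for short chains (assumption)
    d0 = max(0.5, 1.24 * np.cbrt(target_length - 15) - 1.8)
    return float((1.0 / (1.0 + (d / d0) ** 2)).sum() / target_length)
```

With 100 aligned pairs at 1 angstrom, replacing five of them with pairs at 30 angstrom increases the RMSD several times over, while the TM-score decreases only slightly.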

00:08:54You can look at the details a bit more in the slides,

00:09:01but what you have to remember is the fact that you

00:09:04have seen in class four different types of algorithm.

00:09:10They are more or less effective depending on

00:09:14the situation and they use different strategies.

00:09:18So the idea is for you to remember that

00:09:23you can use different strategies to align two structures.

00:09:28They are different in their approach.

00:09:34Questions?

00:09:41Okay, so, superposition, we saw.

00:09:47Once you have structural alignments,

00:09:51and when you have a lot of

00:09:55structures in your databases, when you start to compare structures, you can

00:10:00start making hypotheses about the evolution of proteins. As we

00:10:06saw about evolution, we

00:10:11started from one important observation, that is

00:10:18that for similar structures you can see

00:10:23any level of sequence identity

00:10:28between the two. So you can have cases

00:10:30in which sequence identity is very high, as well as cases in which the structure is

00:10:36the same and sequence identity is low. So the idea

00:10:44behind this is of course

00:10:48an intuition, that is that we expect sequences

00:10:52that are similar to have structures that are

00:10:55similar, and sequences that are different

00:10:59to have structures that are different.

00:11:03This is not the case for proteins,

00:11:05because if you have two structures that are very similar

00:11:11you also have cases in which the sequence is very different.

00:11:16How is that possible, and how can we exploit

00:11:19this for making evolutionary hypotheses about proteins?

00:11:25Well, we did the same exercise

00:11:28but using pairs of proteins that have

00:11:32different structures, and what you see here is that

00:11:36you will never see any pair of proteins with different structure and

00:11:43high sequence identity. Or better, there is a threshold above which, if

00:11:54the sequence identity is higher than that, we can

00:12:00say that the structures are similar. So this is what this plot,

00:12:07this experiment, is telling us: you will

00:12:09never find different structures with

00:12:13sequence identity higher than that threshold.

00:12:16So if I search a sequence against the database and I

00:12:21find a hit with at least that level of identity,

00:12:29I can say that they have the same structure, or at least the

00:12:34same fold, the same overall shape of the three-dimensional conformation. And

00:12:39so, based on this, you can actually

00:12:45transfer

00:12:48information across proteins.
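
Sequence identity, the quantity the argument above is based on, can be computed from a pairwise alignment as sketched here. The choice of denominator (columns where neither sequence has a gap) is an assumption; other conventions, such as dividing by the length of the shorter sequence, are also used, and the resulting percentages differ.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Only columns where both sequences have a residue are counted
    (one possible convention, assumed here).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)
```

For example, `percent_identity("ACDE-G", "ACDQFG")` considers five aligned columns, four of which are identical.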

00:12:54And this is super important.

00:12:56This is the basis of modern

00:12:59bioinformatics; it is the basis of modeling,

00:13:04the fact that you can make hypotheses about

00:13:07evolution. And indeed one important aspect

00:13:13is that you can have

00:13:22sequence divergence. The hypothesis is

00:13:25that the important aspect for

00:13:29proteins to function is the structure.

00:13:32If you break the structure you are sort

00:13:35of breaking the function of the protein. So what

00:13:39you can see during evolution is

00:13:41that two proteins in two different species

00:13:44can accumulate mutations, so their sequences diverge.

00:13:51And these mutations are

00:13:55accumulated as long as they don't alter the structure:

00:14:00if those mutations don't break the structure

00:14:04they are accepted. So if you wait years, millennia,

00:14:10you can see the same protein, for

00:14:12instance hemoglobin, which is there in

00:14:17many, many species of animals. In different species

00:14:26that are far away, you will see that the sequence is

00:14:31different but the structure is the

00:14:34same. So the difference between proteins,

00:14:40the sequence divergence, is like a clock,

00:14:44what is called the molecular clock, and it lets us

00:14:48date when the two proteins, or the species,

00:14:54split in the past. So we all come from

00:14:59a common ancestor, and during

00:15:01evolution species split, and so looking at the differences

00:15:06you can sort of reconstruct the evolutionary events

00:15:12of the past. Of course you can have

00:15:15different levels of diversity.

00:15:19You can have what are called

00:15:21close homologs, so cases in which you have the same function, same structure and

00:15:27similar sequence,

00:15:35so that you see

00:15:40the distance is not that large; it is essentially the same protein.

00:15:45Homology means that two proteins are thought to have

00:15:54the same ancestor, a common ancestor, and it is an inference.

00:16:01So the fact that we are saying they are

00:16:03homologous means that they come from the same ancestor.

00:16:10And you can be more

00:16:14specific. When you say two proteins are homologous, you are just saying

00:16:19the two proteins are related, they have a common ancestor.

00:16:24You can also say if they are orthologs or paralogs. Orthologs means

00:16:33the two proteins are in

00:16:34different species; paralogs are two proteins inside the same species.

00:16:41So you have a copy of the same protein.

00:16:45They both come from a common ancestor, but at some point

00:16:49during evolution one of our genes

00:16:53duplicated and became, for instance,

00:17:00we have hemoglobin, and we also have

00:17:02myoglobin and other globin-like proteins. They

00:17:07come from a duplication event,

00:17:11and we have two chains with different functions.

00:17:16So we have homologous proteins,

00:17:19they are related, and we have them in different species and in the same species.

00:17:25Then we can also classify them

00:17:28as close homologs if they also share a high level

00:17:32of sequence identity, or distant or remote homologs if they share a low value of sequence identity.

00:17:43The last case is

00:17:47convergent evolution. Here you have two different genes that,

00:17:54from different starting points, accumulate

00:17:57mutations and arrive at structures

00:18:02that can be very similar to structures coming from a

00:18:06different ancestor. This is called convergent evolution, and the proteins

00:18:12are not homologous

00:18:14anymore because they come from different ancestors, and they are called

00:18:18analogs. Analogs means same shape, same function,

00:18:25but different ancestor. Is that clear?

00:18:34This is also at the basis of biology, of evolutionary biology.

00:18:39So we start by looking at the composition of things that we know today

00:18:49and start making inferences about what is

00:18:54the origin of proteins,

00:18:59whether they are stable, how many there are, if

00:19:04they are shared across species. And the way this is done

00:19:10is by classifying proteins from

00:19:14a structural perspective and identifying

00:19:20common patterns, common structural elements.

00:19:27So one example

00:19:31is the hypothesis

00:19:35that there are supersecondary structure elements,

00:19:38so it seems like proteins are

00:19:42made of many elementary small local conformations.

00:19:51And the hypothesis is that the origin of complex proteins is

00:19:57the result of the combination of these small elements in different topologies.

00:20:04Here is an example, and here you have

00:20:08a phylogenetic hypothesis

00:20:12showing duplications or evolutionary events over time.

00:20:21We also saw some cases that

00:20:25could explain why sometimes you see discrepancies between

00:20:30sequences and structures. For instance, you

00:20:33see how duplication works and how you can have

00:20:37what is called circular permutation, where you see

00:20:41that a fragment of the original protein is cut

00:20:47and the beginning of it seems to be moved to the end of the

00:20:56protein. You see how that is possible: the gene is

00:21:00duplicated and then is cut in a different position.

00:21:08And therefore, as I said, the way we reason about

00:21:14structural complexity in

00:21:17living organisms was not by classifying proteins,

00:21:22but by classifying what are called

00:21:24domains. Proteins are made not just of combinations of small elements,

00:21:30but at a higher level they are combinations of what are called

00:21:35domains. Domains can be thought of as small proteins,

00:21:40independent proteins, so you can

00:21:44cut a protein between two domains and you won't break

00:21:47the structure, so they stay folded,

00:21:50and that is the classical definition of domain.

00:21:54So a domain is a module or a component of a protein that

00:22:02can fold autonomously and is stable

00:22:07even when the rest of the protein is removed.

00:22:13And domains also evolve

00:22:18independently, and you can find

00:22:21the same domain in

00:22:23different combinations with other domains across proteins.

00:22:26So we started classifying not proteins

00:22:29but domains, so fragments or portions of proteins that are minimal

00:22:36units at the structural and functional level. We saw two

00:22:39databases, SCOP and CATH.

00:22:43We focused on CATH, but the principle is the same. For instance, in

00:22:52CATH, which is an acronym, the letters represent the different levels of

00:22:58classification. You have the Class

00:23:03at the first level, where you just

00:23:05evaluate the content of secondary structure elements.

00:23:11So you can have proteins that are all alpha or all beta, and so on.

00:23:19Then you have the second level, Architecture,

00:23:25where you have the general arrangement of secondary structures.

00:23:50There you need someone who decides where

00:23:54a domain goes. At the lower level,

00:23:59Topology, you are talking about

00:24:03the fold, and this is more precise. So you see, for example,

00:24:07in a sandwich you have a beta sheet in

00:24:10the middle, sandwiched by two layers.

00:24:13This is true for all the proteins that are here,

00:24:19but actually they are different.

00:24:23If you align them, then you see whether they are similar, so the same fold.

00:24:30And there is another level, that is

00:24:34the Homologous superfamily. In this case you also

00:24:41evaluate sequence identity, so you

00:24:46should be able to distinguish homologs from analogs.

00:24:51So if the sequence identity of two proteins is above the threshold, they

00:25:01very likely will be within

00:25:06the same homologous superfamily.
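
The four levels just described (Class, Architecture, Topology, Homologous superfamily) are usually written as a four-number code separated by dots. The sketch below parses such a code and reports the deepest level two domains share. The example codes in the usage note are only illustrations of the notation, and the helper names are hypothetical.

```python
from typing import NamedTuple, Optional

LEVELS = ("Class", "Architecture", "Topology", "Homologous superfamily")

class CathCode(NamedTuple):
    class_: int
    architecture: int
    topology: int
    homologous_superfamily: int

def parse_cath(code: str) -> CathCode:
    """Parse a dotted four-level code such as '2.60.40.10'."""
    parts = code.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four levels, got {code!r}")
    return CathCode(*(int(p) for p in parts))

def deepest_shared_level(code_a: str, code_b: str) -> Optional[str]:
    """Name of the deepest level at which two codes agree, or None."""
    shared = None
    for name, x, y in zip(LEVELS, parse_cath(code_a), parse_cath(code_b)):
        if x != y:
            break  # levels are hierarchical: stop at the first mismatch
        shared = name
    return shared
```

For instance, `deepest_shared_level("2.60.40.10", "2.60.120.200")` reports that the two codes agree down to the Architecture level but have different topologies.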

00:25:14Another thing that is done is to

00:25:17generate hidden Markov models for every homologous superfamily.

00:25:24So you take all the domains of a superfamily, you superimpose them,

00:25:29you create a multiple sequence alignment, and out of

00:25:32this multiple alignment you can build a probabilistic model to search for

00:25:38similar sequences in

00:25:40the biological databases. This is something that I

00:25:43cover in the other course, in the first year of the master.

00:25:54And of course, if you do the

00:25:57analysis of the distribution of domain architectures in the PDB,

00:26:05where you have the structures, or you use those models to search

00:26:09for these domains in complete proteomes,

00:26:15you can actually notice that there is a bias in the

00:26:19PDB. As you know, the structures in

00:26:26the PDB do not represent all proteins; they

00:26:32simply represent what is of

00:26:35interest to the scientific community,

00:26:39what are the structures that are

00:26:41analyzed by the scientific community. And for some proteins you

00:26:45cannot get a crystal or solve the structure,

00:26:48because maybe they are too large or they are too unstable.

00:26:52They need to be stable.

00:27:00So after this

00:27:04we started to look at the comparative modeling task.

00:27:08The algorithm can be super simple.

00:27:17It simply looks for a similar sequence. Keep in mind

00:27:23that the task is to predict the structure of

00:27:25a protein, so what is the structure given the sequence.

00:27:31Just a few slides ago you saw that when you have high identity

00:27:36you can safely assume that the proteins have

00:27:41the same structure, so this is the principle you want to

00:27:44exploit. You have your input sequence,

00:27:47you want to search the PDB and see if there are other proteins

00:27:52that have the same or a similar sequence, and then

00:27:57project, transfer, the coordinates of that protein to

00:28:02your protein. So

00:28:08the first task is to search the sequence

00:28:12against the PDB database, and then you build the

00:28:16model: you transfer the coordinates of

00:28:18the main chain and then you model the side chains.

00:28:25So don't be fooled by the fact that you transfer

00:28:29the coordinates of the main chain. Think about it: you

00:28:34have two proteins with a given level of identity; it means

00:28:41that most of the side chains are

00:28:43different and they are in different orientations.

00:28:48But even if the identity is

00:28:53low, it doesn't mean that the similarity is low, and

00:28:59also it doesn't mean that it is

00:29:02a purely quantitative comparison, because

00:29:05the position of the identical amino acids is

00:29:07important: those identical positions

00:29:11are those that are conserved, those that could not

00:29:17be mutated during evolution. So you project and transfer

00:29:25all the main chain atoms, and then the task is finding the best orientation

00:29:33of the side chains,

00:29:36the optimal orientation to minimize clashes and to optimize

00:29:41the formation of contacts.

00:29:44It's an easy task because the degrees of

00:29:46freedom, just sampling different orientations of

00:29:49side chains, are not like

00:29:51sampling the different conformations of the main chain.
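
The coordinate transfer step of comparative modeling can be sketched as below. This is a simplified illustration, not the procedure of any specific modeling tool. It assumes a target/template alignment given as two gapped strings and one backbone coordinate record per template residue. Positions of the target with no template residue are marked `None`, to be built by a later step, and side chains are not handled at all.

```python
def transfer_backbone(target_aln, template_aln, template_coords, gap="-"):
    """Copy per-residue backbone coordinates from template to target.

    target_aln, template_aln: aligned sequences of equal length.
    template_coords: one coordinate record per template residue, in order.
    Returns a list of (target_residue, coords_or_None).
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have the same length")
    model = []
    tpl_index = 0  # position of the next unused template residue
    for target_res, template_res in zip(target_aln, template_aln):
        coords = None
        if template_res != gap:
            coords = template_coords[tpl_index]
            tpl_index += 1
        if target_res != gap:
            # coords is None for insertions in the target (no template)
            model.append((target_res, coords))
    return model
```

Template residues aligned to a gap in the target are consumed but not copied, which corresponds to a deletion in the target.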

00:30:02Here we saw an example using SWISS-MODEL of how you can use

00:30:10very modern software with

00:30:14a nice interface to analyze the models that you are generating.

00:30:21So this was comparative modeling; it dominated

00:30:26bioinformatics

00:30:27for twenty years. This was

00:30:30the state of the art for modeling structures. You can

00:30:33push it also to templates with lower sequence identity

00:30:39if you need to and you don't have alternatives. At some point

00:30:46people started to notice that

00:30:50we also need to have

00:30:52ab initio prediction methods, so methods that are not based on what you have

00:30:58in the PDB, on

00:31:00templates, and that can actually generalize and provide

00:31:04a prediction for any type of

00:31:06sequence that we want to predict. So we saw

00:31:14CASP, the Critical Assessment of

00:31:16protein Structure Prediction, the challenge

00:31:19that assesses methods for structure prediction and does the benchmarking.

00:31:28How it works:

00:31:39participants provide predictions, then

00:31:43in a few months the structural evidence is completed, so

00:31:49experiments are performed and the coordinates of the native structure

00:31:54are collected experimentally, so we know what is the shape of

00:31:59the structures, and then you evaluate

00:32:02how close the predictions are to the native structure. In this

00:32:09challenge you can, for instance, predict just contacts.

00:32:13Usually targets are divided

00:32:17by difficulty, so there are some

00:32:22that are more difficult to predict because they are not similar to any

00:32:26other known structure, and so they are

00:32:29models without templates, the real ab initio, free modeling

00:32:34challenge, and others that are instead similar to

00:32:37known structures. So they are different.

00:32:48These are the slides

00:32:56about it. This is the typical output

00:33:04of the challenge. What you see, based on the difficulty of the target,

00:33:09is the average GDT_TS. GDT_TS is similar

00:33:15to the TM-score with respect to outliers:

00:33:21it is a fraction computed at different levels of

00:33:23distance cutoff.
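
The GDT_TS score mentioned above averages the fraction of residues that fall within several distance cutoffs, conventionally 1, 2, 4 and 8 angstrom. The sketch below is a simplification: it assumes the per-residue distances come from a single superposition, whereas the full procedure searches for the best superposition at each cutoff.

```python
import numpy as np

def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS (0 to 100) from per-residue distances in angstrom."""
    d = np.asarray(distances, dtype=float)
    # Fraction of residues within each cutoff, then the average of the fractions
    fractions = [(d <= c).mean() for c in cutoffs]
    return 100.0 * float(np.mean(fractions))
```

A residue that is 20 angstrom away lowers every fraction by the same amount as one that is 9 angstrom away, which is why a few badly placed residues do not dominate the score.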

00:33:27Entra Fortin, but also in.

00:33:34This is the line where you have the methods.

00:33:41We know that, since AlphaFold, methods started to

00:33:45become accurate also on difficult cases, and

00:33:49this is the result with AlphaFold

00:33:54and without; you see it here.

00:34:01You see the line is also close to the experimental structure, so it is

00:34:10close to experimental accuracy,

00:34:16and it is independent of the fact that the target is easy or

00:34:20difficult. So it is really very accurate. Though

00:34:24those targets that were tested in CASP were initially small proteins,

00:34:30it was very, very accurate.

00:34:33Ok. The other method we saw

00:34:38is Rosetta. Rosetta dominated CASP

00:34:44until CASP12, in 2016. It has

00:34:48been designed by David Baker, who also won

00:34:51the Nobel Prize in Chemistry a couple of years ago,

00:34:57specifically for protein design,

00:34:59while the AlphaFold developers won the

00:35:05Nobel Prize for structure prediction.

00:35:09The idea of Rosetta was to build

00:35:13the structure starting from segments. So the idea of Rosetta,

00:35:21according to the developers,

00:35:24is that in the PDB you don't have the structures for

00:35:28all proteins out there, but at least you have the structures of

00:35:34all possible local conformations of segments of amino acids,

00:35:42or similar ones. So the idea is that the local conformations

00:35:47of short stretches of amino acids are known as

00:35:53well, so you can exploit this. You

00:35:57can fragment all the structures that you have, you

00:36:01create your pool of fragments, and you need to find

00:36:05the best assembly of those fragments to fit your

00:36:12input sequence. So the first task is to find the subset of

00:36:18fragments, and this can be done by comparing

00:36:23the sequence against the fragment table.

00:36:32There is a formula that captures this, and then

00:36:39the other difficult task is how you

00:36:42combine those fragments, so what are the

00:36:46conformational moves that you have to make

00:36:50between two consecutive fragments. To do that,

00:36:57Rosetta exploits simulated annealing: essentially it tests

00:37:04random different possible conformations, discards

00:37:11quickly the bad ones, and keeps sampling

00:37:17the energy landscape until the global minimum is reached.

00:37:23It uses discriminant functions

00:37:29to calculate and to discriminate immediately what are

00:37:34the wrong conformations,

00:37:39and so this is the statistical tool that tells you how

00:37:44good, how likely, is

00:37:49your prediction. So essentially, given your solution,

00:37:54we can estimate what is the probability of correctness

00:38:01for our solution in Rosetta.
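
The simulated annealing strategy mentioned above can be illustrated with a generic sketch. This is not Rosetta's implementation: the energy function, the move proposal, the geometric cooling schedule and all parameter values are assumptions chosen to keep the example small. Worse moves are accepted with the Metropolis probability exp(-ΔE / T), which shrinks as the temperature decreases.

```python
import math
import random

def simulated_annealing(energy, propose, x0, t_start=2.0, t_end=0.01,
                        steps=5000, seed=0):
    """Minimize `energy` starting from x0; returns (best_state, best_energy)."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for k in range(steps):
        # Geometric cooling from t_start down to t_end (assumed schedule)
        t = t_start * (t_end / t_start) ** (k / (steps - 1))
        candidate = propose(x, rng)
        e_candidate = energy(candidate)
        # Metropolis criterion: always accept improvements, sometimes accept
        # worse states so the search can escape local minima
        if e_candidate <= e or rng.random() < math.exp(-(e_candidate - e) / t):
            x, e = candidate, e_candidate
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e

def toy_energy(x):
    """Toy landscape with a local minimum near x = 1.1 and a deeper one near x = -1.3."""
    return x ** 4 - 3 * x ** 2 + x

def toy_move(x, rng):
    """Random perturbation of the current state."""
    return x + rng.gauss(0.0, 0.5)
```

In a fragment assembly setting the state would be a conformation and a move would replace the torsion angles of a window of residues with those of a fragment; here a single number stands in for the conformation.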

00:38:08Of course you can use

00:38:11different types of discriminant functions.

00:38:16If you remember Bayes and conditional probability,

00:38:20you can factor the probability

00:38:26of the correct structure given the sequence.

00:38:29You can write it in this way. Essentially you

00:38:33can calculate or write other things; for instance

00:38:39the probability of the structure can be proportional

00:38:43to the radius of gyration, the spread

00:38:49of atoms around the center of mass. Or you can also write

00:38:57this probability of the sequence given

00:39:01the structure, and you can for instance represent it as

00:39:05the probability of a specific pair of

00:39:08amino acids given an environment, where the environment can

00:39:14be secondary structure or solvent exposure,

00:39:19or you can also write something more precise.
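
For reference, the factorization being described is Bayes' rule applied to structure and sequence. The radius of gyration is given with its standard definition; the exact functional form that links it to the structure term is on the slides and is not reproduced here.

```latex
% Bayes' rule: the denominator does not depend on the structure,
% so it can be dropped when ranking candidate structures.
P(\mathrm{structure} \mid \mathrm{sequence})
  = \frac{P(\mathrm{sequence} \mid \mathrm{structure})\, P(\mathrm{structure})}
         {P(\mathrm{sequence})}
  \;\propto\; P(\mathrm{sequence} \mid \mathrm{structure})\, P(\mathrm{structure})

% Radius of gyration: spread of the N atoms around the center of mass
R_g = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left\lVert \mathbf{r}_i - \mathbf{r}_{\mathrm{cm}} \right\rVert^{2}},
\qquad
\mathbf{r}_{\mathrm{cm}} = \frac{1}{N} \sum_{i=1}^{N} \mathbf{r}_i
```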

00:39:23That is related to the statistical potentials that we saw. This

00:39:27is the probability of one amino acid to be in a specific environment,

00:39:32so what is the propensity of it to

00:39:36be on the surface, or at a certain level of exposure. This

00:39:42is very similar to the PDF potential, which is

00:39:48the probability of a specific pair of residues to be at a given distance.

00:39:55So again,

00:39:58this is also similar to this one,

00:40:00but in this case

00:40:01we consider the distance between two amino acids.

00:40:05And again, this one can capture some aspects of

00:40:10the structure, like whether two amino acids

00:40:14that are, for instance, of opposite charge tend,

00:40:18on the surface, to be closer, to be

00:40:23in contact. Or if two

00:40:24hydrophobic or hydrophilic amino acids are too close in space

00:40:31it is also weird, because they tend to be on the surface or tend to be

00:40:36at a certain distance, and so on. So you

00:40:39can evaluate, based on the structures that you have predicted,

00:40:44whether the distances are plausible. And,

00:40:51of course, you can complicate

00:40:52it even more and also integrate the environment, so solvent

00:40:58exposure or secondary structure, and

00:41:01so the probability of a specific pair of amino acids

00:41:04to be at a given distance in a given environment

00:41:08for the first amino acid, and a given environment for the second amino acid.

00:41:14We saw the meaning of this just

00:41:18by looking at the same probability when you

00:41:22consider the environment or when you don't

00:41:26consider the environment, and how these energies are calculated

00:41:31depending on the different cases, so opposite charges or same charges.
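
A distance-dependent statistical potential of the kind discussed above can be sketched with the inverse Boltzmann relation, E = -ln(P_observed / P_reference), in units of kT. That the slides use exactly this form, the choice of reference state (all pair types pooled), the pseudocount, and the toy pair labels are all assumptions of this sketch.

```python
import math
from collections import Counter

def pair_potential(observations, pseudocount=1.0):
    """Build a pair potential from observed (pair_type, distance_bin) records.

    Returns a dict mapping (pair_type, distance_bin) to an energy in kT:
    negative means the pair is seen at that distance more often than expected.
    """
    obs = list(observations)
    pair_bin_counts = Counter(obs)
    pair_counts = Counter(pair for pair, _ in obs)
    bin_counts = Counter(b for _, b in obs)
    n_bins, total = len(bin_counts), len(obs)

    energy = {}
    for pair in pair_counts:
        for b in bin_counts:
            # Observed: how often this pair type falls in this distance bin
            p_obs = (pair_bin_counts[(pair, b)] + pseudocount) / (
                pair_counts[pair] + pseudocount * n_bins)
            # Reference: how often any pair falls in this distance bin
            p_ref = bin_counts[b] / total
            energy[(pair, b)] = -math.log(p_obs / p_ref)
    return energy
```

With toy counts where an oppositely charged pair is mostly found in the short-distance bin and a same-charge pair mostly in the long-distance bin, the short-distance energy is negative for the first and positive for the second.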

00:41:41So this was a little bit complicated.

00:41:45In the PDF the slides are probably better, but do remember

00:41:56(these are the slides, sorry)

00:42:10how these probabilities are computed.

00:42:24Okay, now something easier,

00:42:31less technical.

00:42:33So what about other types of proteins,

00:42:40what are called

00:42:42non-globular proteins?

00:42:45What we have seen are methods for the structure of globular proteins,

00:42:51where the specific topology and the specific interaction pattern

00:42:57are very important to preserve the structure.

00:43:01There are other types of proteins that are characterized by not

00:43:08having a fixed structure, or that are different just because

00:43:14we don't have data, like for membrane proteins:

00:43:18they are very difficult to crystallize, and

00:43:21so we don't have many examples, and algorithms that are based on

00:43:29knowledge-based parameters do not work that well for them.

00:43:40This is a recap of what the non-globular proteins are.

00:43:45There are cases that are low complexity sequences;

00:43:51the way to classify these proteins is

00:43:54just to look at the sequence and compute the

00:44:00entropy, or the complexity, of

00:44:03the amino acid sequence. We have proteins that can adopt

00:44:09multiple conformations in three-dimensional space and can

00:44:15change conformation very quickly; they are very

00:44:19dynamic and they are designed to be like

00:44:21that because they can perform different functions.
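
The entropy-based view of low complexity sequences mentioned above can be illustrated with the Shannon entropy of the amino acid composition, computed over a sliding window. The window length of 12 is an assumed example value, and real low complexity detectors add further steps that are not shown.

```python
import math
from collections import Counter

def shannon_entropy(seq):
    """Shannon entropy (bits) of the residue composition of seq."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def window_entropy(seq, window=12):
    """Entropy of every window of the given length along the sequence."""
    return [shannon_entropy(seq[i:i + window])
            for i in range(len(seq) - window + 1)]
```

A stretch made of a single repeated residue has entropy 0, while a window in which every residue is different reaches the maximum for that window length; regions with low values are candidates for low complexity.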

00:44:24We can have pathogenic conformations, like amyloid aggregates.

00:44:32They are non-native, but

00:44:35they are important and they are available in the PDB,

00:44:40so you have some pathological conditions.

00:44:45And then there are proteins that are created by the

00:44:48repetition of very small units

00:44:52or elements, similar to secondary structure elements

00:44:56but sometimes more complex or extended.

00:45:01They are characterized by being,

00:45:06some of them, extended,

00:45:10so they offer an extended platform for interactions.

00:45:15They are usually used

00:45:21for designing new functions or designing

00:45:23new proteins because you have a stable scaffold:

00:45:26you can modify units, you can add

00:45:30insertions, increase the length or the number of

00:45:34the elements, and you will keep the protein fold.

00:45:39The fact that the protein is able to fold is very important,

00:45:45not only for function but

00:45:48also for stability.

00:45:52If a protein doesn't fold, it might aggregate and be insoluble, and you cannot use it.

00:46:01We saw the fact that we have open structures and closed structures.

00:46:09They are special: they are especially

00:46:14degenerate, in the sense that the first and the last units are

00:46:20different. They fold following a cascade of

00:46:27local conformational switches that starts from

00:46:32usually one of the two sides, and there is a cascade of folding,

00:46:37every unit one after the other.

00:46:44This is the classification provided by the database.

00:46:49Open structures and closed structures: solenoids are

00:46:55usually open structures that are formed by symmetric

00:47:03units that are not closed, so they form

00:47:07superhelical structures;

00:47:12instead, here you see the ones that are closed.

00:47:17And they are the most important classes.

00:47:24We saw algorithms to identify repeated elements or units.

00:47:31This is an example where you can build a multiple sequence alignment

00:47:36just by knowing the structures and the connectivity within each element.

00:47:43And we saw other methods; another one is the dot plot.

00:47:50The dot plot is similar to a contact map,

00:47:54but don't be fooled, because in this case

00:47:55we are just looking at the sequence.

00:47:57So here the idea is to use the sequence and see if we can find a pattern that

00:48:06repeats, again exploiting the fact

00:48:10that similar sequences have similar structures.

00:48:15If there is a pattern that is

00:48:18similar somewhere else when we compare the sequence against itself,

00:48:22we should be able to find the repetition. So you see, this is the diagonal:

00:48:27it's the same sequence, that's duplicated.

00:48:33And you can also have inversions.

00:48:44The methods that exploit this are just about

00:48:49finding the best diagonal in the dot plot

00:48:57matrix and trying to extend it.
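
A self dot plot and the search for the best off-diagonal can be sketched as follows. This is a deliberately reduced illustration assuming NumPy: it uses exact residue identity instead of a substitution matrix and scores whole diagonals instead of extending local matches, so it is not the procedure of any specific tool.

```python
import numpy as np

def self_dot_plot(seq):
    """Boolean matrix: entry (i, j) is True when residues i and j are identical."""
    residues = np.array(list(seq))
    return residues[:, None] == residues[None, :]

def best_off_diagonal(dots, min_offset=1):
    """Return (offset, matches) for the off-diagonal with the most dots.

    The offset is the distance from the main diagonal, i.e. a candidate
    repeat length.
    """
    n = dots.shape[0]
    scores = {k: int(np.diagonal(dots, offset=k).sum())
              for k in range(min_offset, n)}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

For a sequence built by repeating a unit of seven distinct residues three times, the best off-diagonal sits at offset 7, the length of the unit.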

00:49:02And we saw another method, which is called

00:49:09REPETITA, and it exploits the Atchley scale.

00:49:14In this case it

00:49:18transforms the sequence into profiles of amino acid properties.

00:49:24You have the Atchley scale representation, and what

00:49:29you want to find is the repetition of

00:49:35peaks. Or better, you can

00:49:39actually compute the Fourier transform of the representation of

00:49:44the sequence according to the Atchley features, and find,

00:49:50in these five different profiles

00:49:53that are provided by the representation,

00:49:56the fundamental frequency that

00:50:01represents the repetition, as in a time series that you

00:50:06see in this representation of

00:50:09the amino acids. So you see the profiles, and for

00:50:13the frequencies you have all possible values,

00:50:22with the peaks corresponding to

00:50:23the best frequency, the best

00:50:27period, say, that is captured by the representation. Given the

00:50:34fundamental frequency, here seven, you see its multiples,

00:50:44which are the peaks that you see. You can calculate the period

00:50:54using this formula here, and you know how long

00:51:00the most probable stretch of the repeated element is.
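
The idea of reading a repeat period from the Fourier transform of a numeric profile can be sketched like this. The sketch assumes NumPy and a single generic profile (one number per residue); it does not use the actual Atchley factor values, and the relation period = N / k between profile length N and peak frequency index k is the standard one for the discrete Fourier transform, which may differ in form from the formula on the slides.

```python
import numpy as np

def dominant_period(profile):
    """Period (in residues) of the strongest periodic component of a profile."""
    x = np.asarray(profile, dtype=float)
    x = x - x.mean()  # remove the constant component
    power = np.abs(np.fft.rfft(x)) ** 2
    # Skip index 0 (zero frequency) and take the strongest peak
    k = int(np.argmax(power[1:])) + 1
    return len(x) / k
```

For a profile of 70 values that oscillates with a period of 7 residues, the strongest peak is at frequency index 10, which gives back a period of 7. With several property profiles, as in the method described, the same analysis would be run on each profile.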

00:51:07Another method we saw works on the structure instead of the sequence.

00:51:13With the sequence alone you are not always able to identify units,

00:51:22but in this case what you are representing

00:51:26is the periodicity of the coordinates. So in

00:51:31this case you just have

00:51:33the coordinates, and you see

00:51:36the oscillation of the coordinates. The problem,

00:51:40of course, is on what dimension you project, and how

00:51:45the structure is oriented, so what rotation you have to apply.

00:51:52Then you see

00:51:57local minima and local maxima, you can calculate

00:52:00how distant two peaks are in the sequence,

00:52:04you can measure it here, and see if there are variations

00:52:12between consecutive peaks, and set up

00:52:18a final period.

00:52:21This is an example of the period profile

00:52:27for a protein that repeats multiple times, and this is the same

00:52:35profile for a protein that is globular.

00:52:39So you see, sometimes you find some signal for a specific period,

00:52:44but in general the profile does not look like that of a real repeat protein.

00:52:56Is not the database time for the proteins.

00:53:02I also provided some

00:53:06references to published papers to calculate

00:53:10some properties for proteins.

00:53:13Okay. Protein dynamics, embeddings:

00:53:22is there any topic you would like to go through?

00:53:27I have

00:53:30a few minutes also tomorrow, so if there is

00:53:34something specific you want to

00:53:37go through, maybe

00:53:40the most interesting part, or molecular dynamics,

00:53:45if you think it's not super clear.

00:53:50Okay, so. End of the slides.

00:54:00Okay. Okay. Thank you.