By: , May 30, 2013 3:39 pm
Room: Moderated Discussions
Sebastian Soeiro (sebastian_2896.delete@this.hotmail.com) on May 30, 2013 1:13 pm wrote:
> Ricardo B (ricardo.b.delete@this.xxxxx.xx) on May 30, 2013 11:18 am wrote:
> > Sebastian Soeiro (sebastian_2896.delete@this.hotmail.com) on May 30, 2013 9:48 am wrote:
> >
> > > - So it seems to me that the AGU almost computes a "tag" for the CPU to use to refer to
> > > the actual data entries, whilst the DTLB holds the actual physical location of these data entries?
> > > Or do I still have something wrong? The AGU seems very ambiguous to me... Or perhaps the AGU is the
> > > unit that requests data to be fed to the other execution units? Receiving instructions from the scheduler
> > > to request certain data entries, which moves on to the DTLB, which moves on to the L1/L2?
> >
> >
> > There are two (main) types of memory addresses in a modern CPU: virtual and physical.
> > In general, software handles virtual addresses.
> >
> > The operating system sets up a page table, which is just a big look up table which maps pages (4KB
> > in x86) of virtual addresses into physical addresses and also contains access permissions.
> >
> > I.e., something like
> > Virtual address => physical address, permissions, etc
> > 0x00000000 => 0x19831, read, write, no execute
> > 0x00000001 => 0x32911, read, no write, execute
> > ...
> > This mechanism is required for any operating system more robust than DOS.
> >
> > But before we go to the page table, an x86 instruction which accesses memory
> > can use pretty complicated addressing, such as DS:EBX + 4*ESI + 39;
> > The AGU does all the math and comes up with a single number, which is called the virtual address.
> >
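The address arithmetic described above can be sketched roughly as follows. This is only an illustration: the function name `effective_address`, the register values, and the flat segment base of 0 are assumptions made up for the example, not anything taken from a real CPU.

```python
def effective_address(segment_base, base, index, scale, displacement, bits=32):
    """Sketch of the AGU math: segment_base + base + scale*index + displacement.

    All names and values here are hypothetical. The result is wrapped to
    `bits` bits, the way a 32-bit address computation would overflow.
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 only allows a scale of 1, 2, 4 or 8")
    mask = (1 << bits) - 1
    offset = (base + scale * index + displacement) & mask
    return (segment_base + offset) & mask


# DS:EBX + 4*ESI + 39, with made-up register contents.
ds_base = 0x0      # assumed flat segment (base 0)
ebx = 0x1000       # hypothetical base register value
esi = 0x10         # hypothetical index register value

virtual_address = effective_address(ds_base, ebx, esi, 4, 39)
print(hex(virtual_address))  # 0x1067
```

The point is only that several inputs go in and one number comes out; that number is what the DTLB sees next.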
> > The DTLB then looks up the page table entry (it caches recently used entries, so most
> > accesses avoid reading the page table from memory) to convert the virtual
> > address into a physical memory address, permission bits, etc.
> >
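A minimal sketch of that translation step, assuming the left column of the example table holds virtual page numbers and the right column physical page (frame) numbers. The names `page_table`, `tlb`, and `translate` are hypothetical, and a real DTLB is a small hardware cache with a limited number of entries rather than a dictionary.

```python
PAGE_SHIFT = 12                 # 4KB pages, as in the post
PAGE_SIZE = 1 << PAGE_SHIFT

# Hypothetical page table: virtual page number -> (physical frame number, permissions).
page_table = {
    0x00000000: (0x19831, {"read", "write"}),
    0x00000001: (0x32911, {"read", "execute"}),
}

# The TLB starts empty and caches entries from the page table as they are used.
tlb = {}


def translate(vaddr, access):
    """Translate a virtual address into a physical one, checking permissions."""
    vpn = vaddr >> PAGE_SHIFT            # which page
    offset = vaddr & (PAGE_SIZE - 1)     # position inside the page
    entry = tlb.get(vpn)
    if entry is None:                    # TLB miss: consult the page table
        if vpn not in page_table:
            raise LookupError("page fault: no mapping for page %#x" % vpn)
        entry = page_table[vpn]
        tlb[vpn] = entry                 # cache it for next time
    frame, permissions = entry
    if access not in permissions:
        raise PermissionError("%s not allowed on page %#x" % (access, vpn))
    return (frame << PAGE_SHIFT) | offset


print(hex(translate(0x1067, "read")))    # 0x32911067
```

Only the page number is translated; the low 12 bits (the offset within the page) pass through unchanged.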
> >
> > > - So the store buffer is very unclear to me. Your explanation probably makes perfect sense,
> > > it's just me who probably doesn't understand. Though what I don't understand is: where do these
> > > store requests come from? I understand that the store buffer holds data before it is committed
> > > to the data caches, but where does this data come from? The execution units? If so, why would
> > > it be redirected back to the data caches if they were already "done" with it?
> >
> > The stores come from instructions which store data into memory.
> >
> > > - So the prefetcher sort of acts like the data part of a loop detector? Or other patterns?
> >
> > Yep, probably something like that. Only Intel and God know the details.
> >
>
> Thanks again for the informative reply!
>
> I think I'll go in reverse order since that would be most chronological from smallest to largest.
>
> - Does the prefetcher use data path bandwidth aggressively? Or does it simply use it when there is "free
> bandwidth" to load predicted data needed? I'd imagine that the latter would be most beneficial and safe.
>
> - So the stores come from instructions, hrm... So if an instruction was something like: (mul a,b
> = c), it would store c back into memory? Does that go directly into memory, or through the caches?
> If it goes directly to memory, does that have its own path dedicated to this purpose?
>
> > > - So it does seem that a virtual address is a "tag" that the AGU generates for use by the execution
> > > units. So, the flow seems to be: ROB > Data Scheduler > AGU (converts instruction addresses/physical
> > > into virtual addresses) > DTLB? If this is correct, I have a question that probably has a really
> > > obvious answer... Why can't instructions just use the physical address itself instead of some
> > > encrypted address for the AGU to figure out? And more so, why does the CPU need to use virtual
> > > addresses? Why can't it just use physical addresses the whole time?
>
> As always, much appreciation to all your informative answers!
Oh yes, and before I forget! I have one last question that is probably easy to answer.
It may be a bit unrelated, so I apologize for it, but simply: what does Hyperthreading do?
Originally, I read that it allowed both the integer unit and the floating point unit to work simultaneously. Back then I thought that the FP unit and the ALU were entirely separate, singular units, and that the scheduler could only dispatch a single instruction at a time. Now it seems that the FP and ALU units are mixed in with each other rather than being singular units, and that the scheduler can dispatch a number of instructions to any ALU and FP unit in any cycle, given that the ports are free for it.
This makes my understanding of hyperthreading kind of... no longer valid.
If it isn't too much trouble, could you please answer that too?
Thank you very much!