<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: More on __restrict</title>
	<atom:link href="http://assemblyrequired.crashworks.org/2008/09/06/more-on-__restrict/feed/" rel="self" type="application/rss+xml" />
	<link>http://assemblyrequired.crashworks.org/2008/09/06/more-on-__restrict/</link>
	<description>Technical Notes On Game Development</description>
	<lastBuildDate>Wed, 28 Dec 2011 10:00:34 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
	<item>
		<title>By: Gregory</title>
		<link>http://assemblyrequired.crashworks.org/2008/09/06/more-on-__restrict/comment-page-1/#comment-5154</link>
		<dc:creator>Gregory</dc:creator>
		<pubDate>Wed, 17 Feb 2010 10:36:28 +0000</pubDate>
		<guid isPermaLink="false">http://assemblyrequired.wordpress.com/?p=14#comment-5154</guid>
		<description>Here is the working URL for Mike Acton&#039;s article on understanding strict aliasing:

http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html</description>
		<content:encoded><![CDATA[<p>Here is the working URL for Mike Acton&#8217;s article on understanding strict aliasing:</p>
<p><a href="http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html" rel="nofollow">http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Elan</title>
		<link>http://assemblyrequired.crashworks.org/2008/09/06/more-on-__restrict/comment-page-1/#comment-40</link>
		<dc:creator>Elan</dc:creator>
		<pubDate>Thu, 30 Oct 2008 00:10:01 +0000</pubDate>
		<guid isPermaLink="false">http://assemblyrequired.wordpress.com/?p=14#comment-40</guid>
		<description>Well in general I think you&#039;re better off making an algorithmic decision up front about whether or not the pointers are allowed to be aliased at all. In many cases having aliased pointers to structures is a sign that something has gone wrong -- consider what would happen inside (contrived example)
&lt;pre&gt;void CrossProductNoStackCopy( Vector *out, Vector *in1, Vector *in2 )
{
  out-&gt;x = in1-&gt;y * in2-&gt;z - in1-&gt;z * in2-&gt;y ;
  out-&gt;y = in1-&gt;z * in2-&gt;x - in1-&gt;x * in2-&gt;z ;
  out-&gt;z = in1-&gt;x * in2-&gt;y - in1-&gt;y * in2-&gt;x ;
}&lt;/pre&gt;
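if out and in1 were the same pointer: the write to out-&gt;x on the first line would clobber in1-&gt;x before the second and third lines read it.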

But if you really really needed to make the determination at runtime I guess you could do something like
&lt;pre&gt;void func(int *a, int *b) 
{
   if ( a == b )
   {
      // aliased code
   }
   else
   {  // we know they are not aliased
      int * __restrict pA = a; 
      int * __restrict pB = b; 
      // nonaliased code runs on pA and pB
   }
}&lt;/pre&gt;
which trades the 40-cycle LHS for a 5-20 cycle branch stall.</description>
		<content:encoded><![CDATA[<p>Well in general I think you&#8217;re better off making an algorithmic decision up front about whether or not the pointers are allowed to be aliased at all. In many cases having aliased pointers to structures is a sign that something has gone wrong &#8212; consider what would happen inside (contrived example)</p>
<pre>void CrossProductNoStackCopy( Vector *out, Vector *in1, Vector *in2 )
{
  out-&gt;x = in1-&gt;y * in2-&gt;z - in1-&gt;z * in2-&gt;y ;
  out-&gt;y = in1-&gt;z * in2-&gt;x - in1-&gt;x * in2-&gt;z ;
  out-&gt;z = in1-&gt;x * in2-&gt;y - in1-&gt;y * in2-&gt;x ;
}</pre>
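<p>if out and in1 were the same pointer: the write to out-&gt;x on the first line would clobber in1-&gt;x before the second and third lines read it.</p>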
<p>But if you really really needed to make the determination at runtime I guess you could do something like</p>
<pre>void func(int *a, int *b)
{
   if ( a == b )
   {
      // aliased code
   }
   else
   {  // we know they are not aliased
      int * __restrict pA = a;
      int * __restrict pB = b;
      // nonaliased code runs on pA and pB
   }
}</pre>
<p>which trades the 40-cycle LHS for a 5-20 cycle branch stall.</p>
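<p>A minimal sketch of that runtime dispatch for arrays rather than single ints, where comparing the two base pointers for equality is no longer enough and the whole ranges have to be checked for overlap. The names <code>RangesOverlap</code> and <code>AddArrays</code> are hypothetical, and <code>__restrict</code> is assumed to be accepted as a compiler extension (MSVC, GCC and Clang all do):</p>

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when the ranges [a, a+n) and [b, b+n) share
// at least one element. The addresses are compared as integers because
// relational comparison of unrelated pointers is unspecified in C++.
static bool RangesOverlap( const int *a, const int *b, std::size_t n )
{
   const std::uintptr_t ua = reinterpret_cast<std::uintptr_t>( a );
   const std::uintptr_t ub = reinterpret_cast<std::uintptr_t>( b );
   const std::uintptr_t bytes = n * sizeof( int );
   return ua < ub + bytes && ub < ua + bytes;
}

// Hypothetical example: adds src[i] into dst[i] for i in [0, n).
void AddArrays( int *dst, const int *src, std::size_t n )
{
   if ( RangesOverlap( dst, src, n ) )
   {
      // aliased code: no promises made, so the compiler must assume
      // every store to dst may change what src points at
      for ( std::size_t i = 0; i < n; ++i )
         dst[i] += src[i];
   }
   else
   {
      // we know they are not aliased, so the promise is safe to make
      int * __restrict pDst = dst;
      const int * __restrict pSrc = src;
      for ( std::size_t i = 0; i < n; ++i )
         pDst[i] += pSrc[i];
   }
}
```

<p>Whether the extra compare and branch is actually cheaper than the stall it avoids is something to measure on the target hardware.</p>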
]]></content:encoded>
	</item>
	<item>
		<title>By: Acetone.</title>
		<link>http://assemblyrequired.crashworks.org/2008/09/06/more-on-__restrict/comment-page-1/#comment-35</link>
		<dc:creator>Acetone.</dc:creator>
		<pubDate>Tue, 28 Oct 2008 00:52:21 +0000</pubDate>
		<guid isPermaLink="false">http://assemblyrequired.wordpress.com/?p=14#comment-35</guid>
		<description>That&#039;s a pretty good post. Arseny actually raises an interesting point: it&#039;s both good and bad. I ran into exactly this issue.

Anyway, what I was wondering is whether it&#039;s possible to tell (or modify) the compiler to _assume_ pointers are unaliased, and to insert the code containing the LHS inside a branch (which tests at runtime whether the pointers are aliased).

Why would we need this?

Obviously the profiling tools can point out the bits of code that are actually causing the pipeline stalls, but in some cases it&#039;s not possible to refactor, or to know whether we can _actually_ use the restrict keyword.

With failsafe code inside a branch (which can be predicted to be false most of the time...) to take care of the rare condition of aliasing, I think it would make the job of profiling/re-coding/profiling easier.

Of course, this also depends on the assumption that a branch would be less expensive than a stall...</description>
		<content:encoded><![CDATA[<p>That&#8217;s a pretty good post. Arseny actually raises an interesting point: it&#8217;s both good and bad. I ran into exactly this issue.</p>
<p>Anyway, what I was wondering is whether it&#8217;s possible to tell (or modify) the compiler to _assume_ pointers are unaliased, and to insert the code containing the LHS inside a branch (which tests at runtime whether the pointers are aliased).</p>
<p>Why would we need this?</p>
<p>Obviously the profiling tools can point out the bits of code that are actually causing the pipeline stalls, but in some cases it&#8217;s not possible to refactor, or to know whether we can _actually_ use the restrict keyword.</p>
<p>With failsafe code inside a branch (which can be predicted to be false most of the time&#8230;) to take care of the rare condition of aliasing, I think it would make the job of profiling/re-coding/profiling easier.</p>
<p>Of course, this also depends on the assumption that a branch would be less expensive than a stall&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ruskin</title>
		<link>http://assemblyrequired.crashworks.org/2008/09/06/more-on-__restrict/comment-page-1/#comment-34</link>
		<dc:creator>Ruskin</dc:creator>
		<pubDate>Thu, 11 Sep 2008 19:35:45 +0000</pubDate>
		<guid isPermaLink="false">http://assemblyrequired.wordpress.com/?p=14#comment-34</guid>
		<description>Thanks RC! I was hoping they&#039;d be clear, it&#039;s a confusing subject. I&#039;m finally getting around to writing up all the stuff I couldn&#039;t squeeze into my GDC talk. =)</description>
		<content:encoded><![CDATA[<p>Thanks RC! I was hoping they&#8217;d be clear, it&#8217;s a confusing subject. I&#8217;m finally getting around to writing up all the stuff I couldn&#8217;t squeeze into my GDC talk. =)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: R Caloca</title>
		<link>http://assemblyrequired.crashworks.org/2008/09/06/more-on-__restrict/comment-page-1/#comment-33</link>
		<dc:creator>R Caloca</dc:creator>
		<pubDate>Thu, 11 Sep 2008 17:53:35 +0000</pubDate>
		<guid isPermaLink="false">http://assemblyrequired.wordpress.com/?p=14#comment-33</guid>
		<description>Excellent posts! I was about to write about it too... http://rcaloca.blogspot.com/2008/09/beaten-on-restrict.html</description>
		<content:encoded><![CDATA[<p>Excellent posts! I was about to write about it too&#8230; <a href="http://rcaloca.blogspot.com/2008/09/beaten-on-restrict.html" rel="nofollow">http://rcaloca.blogspot.com/2008/09/beaten-on-restrict.html</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ruskin</title>
		<link>http://assemblyrequired.crashworks.org/2008/09/06/more-on-__restrict/comment-page-1/#comment-32</link>
		<dc:creator>Ruskin</dc:creator>
		<pubDate>Thu, 11 Sep 2008 05:01:25 +0000</pubDate>
		<guid isPermaLink="false">http://assemblyrequired.wordpress.com/?p=14#comment-32</guid>
		<description>No, although that underscores the confusing nature of __restrict aliasing bugs. If you take a close look at MemberFunc:
&lt;blockquote&gt;
&lt;code&gt;int CFoo::MemberFuncWithRestrict( int * a, int x )
{
	m_iBar = x;
	*a = 7;
	return m_iBar + *a;
}&lt;/code&gt;
&lt;/blockquote&gt;

What&#039;s supposed to happen is that it sets &lt;code&gt;m_iBar&lt;/code&gt; to a number, then sets the contents of pointer &lt;code&gt;a&lt;/code&gt; to 7, and then adds &lt;code&gt;m_iBar&lt;/code&gt; to whatever is read through pointer &lt;code&gt;a&lt;/code&gt;. With parameters of &lt;code&gt;(&amp;m_iBar, 5)&lt;/code&gt;, what &lt;i&gt;should&lt;/i&gt; happen is that &lt;code&gt;*a&lt;/code&gt; and &lt;code&gt;m_iBar&lt;/code&gt; are the same thing, so &lt;code&gt;m_iBar&lt;/code&gt; should get set to &lt;b&gt;7&lt;/b&gt; at line two, and the function should then return 7+7=14. The compiler does this by reloading &lt;code&gt;m_iBar&lt;/code&gt; from memory after &lt;code&gt;*a&lt;/code&gt; = 7.

With __restrict, you have promised the compiler that &lt;code&gt;a&lt;/code&gt; &#8800; &lt;code&gt;&amp;m_iBar&lt;/code&gt;, so it does not reload &lt;code&gt;m_iBar&lt;/code&gt;. Instead, it assumes that &lt;code&gt;m_iBar&lt;/code&gt; is still 5 (and not 7) after the assignment to &lt;code&gt;*a&lt;/code&gt;, and returns 5+7 = 12.</description>
		<content:encoded><![CDATA[<p>No, although that underscores the confusing nature of __restrict aliasing bugs. If you take a close look at MemberFunc:</p>
<blockquote><p>
<code>int CFoo::MemberFuncWithRestrict( int * a, int x )<br />
{<br />
	m_iBar = x;<br />
	*a = 7;<br />
	return m_iBar + *a;<br />
}</code>
</p></blockquote>
<p>What&#8217;s supposed to happen is that it sets <code>m_iBar</code> to a number, then sets the contents of pointer <code>a</code> to 7, and then adds <code>m_iBar</code> to whatever is read through pointer <code>a</code>. With parameters of <code>(&amp;m_iBar, 5)</code>, what <i>should</i> happen is that <code>*a</code> and <code>m_iBar</code> are the same thing, so <code>m_iBar</code> should get set to <b>7</b> at line two, and the function should then return 7+7=14. The compiler does this by reloading <code>m_iBar</code> from memory after <code>*a</code> = 7.</p>
<p>With __restrict, you have promised the compiler that <code>a</code> &ne; <code>&amp;m_iBar</code>, so it does not reload <code>m_iBar</code>. Instead, it assumes that <code>m_iBar</code> is still 5 (and not 7) after the assignment to <code>*a</code>, and returns 5+7 = 12.</p>
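<p>To make the arithmetic concrete, here is a rough sketch of both variants side by side. It assumes the qualifier sits on the parameter <code>a</code> (the listing above does not show where it goes), and the in-class layout of <code>CFoo</code> and the plain <code>MemberFunc</code> body are guesses modelled on that listing:</p>

```cpp
// Hypothetical reconstruction of the class under discussion.
struct CFoo
{
   int m_iBar = 0;

   // No promise: the compiler has to reload m_iBar after the store
   // through a, because a might point at m_iBar.
   int MemberFunc( int *a, int x )
   {
      m_iBar = x;
      *a = 7;
      return m_iBar + *a;
   }

   // Promise (assumed placement): a never points at m_iBar, so the
   // compiler is free to keep x in a register and skip the reload.
   int MemberFuncWithRestrict( int * __restrict a, int x )
   {
      m_iBar = x;
      *a = 7;
      return m_iBar + *a;
   }
};
```

<p>Calling <code>MemberFunc</code> with the address of its own <code>m_iBar</code> and 5 returns 14. Calling <code>MemberFuncWithRestrict</code> with a pointer to some other int and 5 returns 12, which is the correct answer in that case. Passing the address of <code>m_iBar</code> to the restrict version breaks the promise, so the result is undefined, and the 12 described above is only one possible outcome.</p>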
]]></content:encoded>
	</item>
	<item>
		<title>By: B</title>
		<link>http://assemblyrequired.crashworks.org/2008/09/06/more-on-__restrict/comment-page-1/#comment-31</link>
		<dc:creator>B</dc:creator>
		<pubDate>Thu, 11 Sep 2008 04:36:44 +0000</pubDate>
		<guid isPermaLink="false">http://assemblyrequired.wordpress.com/?p=14#comment-31</guid>
		<description>&lt;blockquote&gt;you will get incorrect results, in this case a return value of 12 instead of 14.&lt;/blockquote&gt;

Did you mean

you will get incorrect results, in this case a return value of 14 instead of 12.</description>
		<content:encoded><![CDATA[<blockquote><p>you will get incorrect results, in this case a return value of 12 instead of 14.</p></blockquote>
<p>Did you mean</p>
<p>you will get incorrect results, in this case a return value of 14 instead of 12.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Arseny Kapoulkine</title>
		<link>http://assemblyrequired.crashworks.org/2008/09/06/more-on-__restrict/comment-page-1/#comment-30</link>
		<dc:creator>Arseny Kapoulkine</dc:creator>
		<pubDate>Sun, 07 Sep 2008 05:20:50 +0000</pubDate>
		<guid isPermaLink="false">http://assemblyrequired.wordpress.com/?p=14#comment-30</guid>
		<description>Yeah, there seems to be no strict aliasing in MSVC. It was both a good and bad thing for us - good, because of additional optimizations when switching to GCC (PS3), bad, because of additional PS3-only bugs...</description>
		<content:encoded><![CDATA[<p>Yeah, there seems to be no strict aliasing in MSVC. It was both a good and bad thing for us &#8211; good, because of additional optimizations when switching to GCC (PS3), bad, because of additional PS3-only bugs&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ruskin</title>
		<link>http://assemblyrequired.crashworks.org/2008/09/06/more-on-__restrict/comment-page-1/#comment-28</link>
		<dc:creator>Ruskin</dc:creator>
		<pubDate>Sat, 06 Sep 2008 20:03:53 +0000</pubDate>
		<guid isPermaLink="false">http://assemblyrequired.wordpress.com/?p=14#comment-28</guid>
		<description>That&#039;s what I thought, too, until I actually compiled the function and looked at the output, which is what you see here. Apparently the Microsoft compiler doesn&#039;t enable strict aliasing. Another example of the vast chasm between what the compiler &lt;em&gt;should&lt;/em&gt; do and what it &lt;em&gt;does&lt;/em&gt; do.

In any case having ints and floats aliased like this is a contrived example and rather silly. The more likely case is two pointers of the same type.</description>
		<content:encoded><![CDATA[<p>That&#8217;s what I thought, too, until I actually compiled the function and looked at the output, which is what you see here. Apparently the Microsoft compiler doesn&#8217;t enable strict aliasing. Another example of the vast chasm between what the compiler <em>should</em> do and what it <em>does</em> do.</p>
<p>In any case having ints and floats aliased like this is a contrived example and rather silly. The more likely case is two pointers of the same type.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Arseny Kapoulkine</title>
		<link>http://assemblyrequired.crashworks.org/2008/09/06/more-on-__restrict/comment-page-1/#comment-29</link>
		<dc:creator>Arseny Kapoulkine</dc:creator>
		<pubDate>Sat, 06 Sep 2008 19:50:16 +0000</pubDate>
		<guid isPermaLink="false">http://assemblyrequired.wordpress.com/?p=14#comment-29</guid>
		<description>As far as I understand, the example &#039;slow&#039; function is kinda wrong.

The compiler can assume that a and b don&#039;t alias, because if they do alias, it&#039;s a violation of the strict aliasing rules (the calling code with (float*)(foo + 1) contains this violation). Whether the compiler actually assumes this of course depends on the implementation and optimization settings.</description>
		<content:encoded><![CDATA[<p>As far as I understand, the example &#8216;slow&#8217; function is kinda wrong.</p>
<p>The compiler can assume that a and b don&#8217;t alias, because if they do alias, it&#8217;s a violation of the strict aliasing rules (the calling code with (float*)(foo + 1) contains this violation). Whether the compiler actually assumes this of course depends on the implementation and optimization settings.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

