<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Why You Should Never Cast Floats To Ints</title>
	<atom:link href="http://assemblyrequired.crashworks.org/2009/01/12/why-you-should-never-cast-floats-to-ints/feed/" rel="self" type="application/rss+xml" />
	<link>http://assemblyrequired.crashworks.org/2009/01/12/why-you-should-never-cast-floats-to-ints/</link>
	<description>Technical Notes On Game Development</description>
	<lastBuildDate>Wed, 21 Jul 2010 18:22:16 -0700</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: GameCoder.it &#8722; Il cast floatint</title>
		<link>http://assemblyrequired.crashworks.org/2009/01/12/why-you-should-never-cast-floats-to-ints/comment-page-1/#comment-5220</link>
		<dc:creator>GameCoder.it &#8722; Il cast floatint</dc:creator>
		<pubDate>Thu, 01 Jul 2010 09:20:28 +0000</pubDate>
		<guid isPermaLink="false">http://assemblyrequired.crashworks.org/?p=124#comment-5220</guid>
		<description>[...] why-you-should-never-cast-floats-to-ints [1] fast-floating-point-to-integer-conversions [...]</description>
		<content:encoded><![CDATA[<p>[...] why-you-should-never-cast-floats-to-ints [1] fast-floating-point-to-integer-conversions [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Elan</title>
		<link>http://assemblyrequired.crashworks.org/2009/01/12/why-you-should-never-cast-floats-to-ints/comment-page-1/#comment-65</link>
		<dc:creator>Elan</dc:creator>
		<pubDate>Wed, 14 Jan 2009 01:02:39 +0000</pubDate>
		<guid isPermaLink="false">http://assemblyrequired.crashworks.org/?p=124#comment-65</guid>
		<description>Thanks for the clarification, cb! I&#039;ve corrected that passage.</description>
		<content:encoded><![CDATA[<p>Thanks for the clarification, cb! I&#8217;ve corrected that passage.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: cb</title>
		<link>http://assemblyrequired.crashworks.org/2009/01/12/why-you-should-never-cast-floats-to-ints/comment-page-1/#comment-64</link>
		<dc:creator>cb</dc:creator>
		<pubDate>Wed, 14 Jan 2009 00:49:26 +0000</pubDate>
		<guid isPermaLink="false">http://assemblyrequired.crashworks.org/?p=124#comment-64</guid>
		<description>I&#039;m on a P3 and get numbers about the same as the Stereopsis numbers (e.g. the XS addition trick is slightly faster than fistp).  My objection was to this part:

&quot;Furthermore, most of the magic-number tricks involve an integer step on the floating-point number, which is disastrously slow because of the way the registers are partitioned&quot;

which is true on consoles but not on x86.  The big win of using the magic-number trick on PCs is that you can distribute the code or drop it into new projects without worrying about what compiler settings they&#039;re using.  Another advantage over fistp is that you don&#039;t have to worry about how the FPU rounding mode is set.

Note, though, that the magic-number truncate is really slow; only the magic-number round-to-int is fast.</description>
		<content:encoded><![CDATA[<p>I&#8217;m on a P3 and get numbers about the same as the Stereopsis numbers (e.g. the XS addition trick is slightly faster than fistp).  My objection was to this part:</p>
<p>&#8220;Furthermore, most of the magic-number tricks involve an integer step on the floating-point number, which is disastrously slow because of the way the registers are partitioned&#8221;</p>
<p>which is true on consoles but not on x86.  The big win of using the magic-number trick on PCs is that you can distribute the code or drop it into new projects without worrying about what compiler settings they&#8217;re using.  Another advantage over fistp is that you don&#8217;t have to worry about how the FPU rounding mode is set.</p>
<p>Note, though, that the magic-number truncate is really slow; only the magic-number round-to-int is fast.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Elan</title>
		<link>http://assemblyrequired.crashworks.org/2009/01/12/why-you-should-never-cast-floats-to-ints/comment-page-1/#comment-60</link>
		<dc:creator>Elan</dc:creator>
		<pubDate>Tue, 13 Jan 2009 19:43:43 +0000</pubDate>
		<guid isPermaLink="false">http://assemblyrequired.crashworks.org/?p=124#comment-60</guid>
		<description>Wow, MSVC fail:
&lt;blockquote&gt;error C3861: &#039;lrint&#039;: identifier not found&lt;/blockquote&gt;
=(</description>
		<content:encoded><![CDATA[<p>Wow, MSVC fail:</p>
<blockquote><p>error C3861: &#8216;lrint&#8217;: identifier not found</p></blockquote>
<p>=(</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: syskill</title>
		<link>http://assemblyrequired.crashworks.org/2009/01/12/why-you-should-never-cast-floats-to-ints/comment-page-1/#comment-59</link>
		<dc:creator>syskill</dc:creator>
		<pubDate>Tue, 13 Jan 2009 18:53:17 +0000</pubDate>
		<guid isPermaLink="false">http://assemblyrequired.crashworks.org/?p=124#comment-59</guid>
		<description>I did some digging into whether and how this is handled in my world; it seems that the trick to avoiding fldcw in GCC is to use the lrint() function from the C99 standard, and then compile with -ffast-math so that it will be inlined.

Maybe lrint() with /fp:fast will give the desired results in MSVC without resorting to deprecated switches?</description>
		<content:encoded><![CDATA[<p>I did some digging into whether and how this is handled in my world; it seems that the trick to avoiding fldcw in GCC is to use the lrint() function from the C99 standard, and then compile with -ffast-math so that it will be inlined.</p>
<p>Maybe lrint() with /fp:fast will give the desired results in MSVC without resorting to deprecated switches?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Elan</title>
		<link>http://assemblyrequired.crashworks.org/2009/01/12/why-you-should-never-cast-floats-to-ints/comment-page-1/#comment-58</link>
		<dc:creator>Elan</dc:creator>
		<pubDate>Tue, 13 Jan 2009 00:00:01 +0000</pubDate>
		<guid isPermaLink="false">http://assemblyrequired.crashworks.org/?p=124#comment-58</guid>
		<description>What hardware are you running, cb? I just tried a test that converts an array of 1024 floats 100,000 times (i.e., performs 1.024&#215;10&lt;sup&gt;8&lt;/sup&gt; conversions while fitting in cache), and here are the results for my Intel Core2 @ 2.4 GHz: 

&lt;center&gt;&lt;table width=&quot;90%&quot; cellpadding=&quot;4&quot; &gt;
&lt;tr&gt;&lt;td&gt;
 &lt;table border style=&quot;width:30em; text-align:right&quot; &gt;
   &lt;caption &gt;_ftol2_sse versus magic-number truncation&lt;br/&gt;(4 trials)&lt;/caption&gt;
&lt;tr style=&quot;text-align:center&quot;&gt;&lt;th&gt;_ftol2_sse&lt;/th&gt;&lt;th&gt;magic number&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;313.932ms&lt;/td&gt;&lt;td&gt;182.816ms&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;319.619ms&lt;/td&gt;&lt;td&gt;181.992ms&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;311.646ms&lt;/td&gt;&lt;td&gt;178.677ms&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;310.179ms&lt;/td&gt;&lt;td&gt;177.646ms&lt;/td&gt;&lt;/tr&gt;
 &lt;/table&gt;
&lt;/td&gt;&lt;td&gt;
&lt;table border  style=&quot;width:30em; text-align:right&quot;  &gt;
   &lt;caption &gt;fistp (via /QIfist) versus magic-number truncation&lt;br/&gt;(4 trials)&lt;/caption&gt;
&lt;tr style=&quot;text-align:center&quot;&gt;&lt;th&gt;&lt;tt&gt;fistp&lt;/tt&gt;&lt;/th&gt;&lt;th&gt;magic number&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;185.479ms&lt;/td&gt;&lt;td&gt;179.934ms&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;180.314ms&lt;/td&gt;&lt;td&gt;182.722ms&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;183.951ms&lt;/td&gt;&lt;td&gt;179.802ms&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;178.270ms&lt;/td&gt;&lt;td&gt;178.007ms&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;&lt;/center&gt;

It looks like the magic-number trick varies between about 1% slower and 3% faster than native fistp across these trials, which is basically the same.

</description>
		<content:encoded><![CDATA[<p>What hardware are you running, cb? I just tried a test that converts an array of 1024 floats 100,000 times (i.e., performs 1.024&times;10<sup>8</sup> conversions while fitting in cache), and here are the results for my Intel Core2 @ 2.4 GHz: </p>
<p><center><br />
<table width="90%" cellpadding="4" >
<tr>
<td>
<table border style="width:30em; text-align:right" >
<caption>_ftol2_sse versus magic-number truncation<br />(4 trials)</caption>
<tr style="text-align:center">
<th>_ftol2_sse</th>
<th>magic number</th>
</tr>
<tr>
<td>313.932ms</td>
<td>182.816ms</td>
</tr>
<tr>
<td>319.619ms</td>
<td>181.992ms</td>
</tr>
<tr>
<td>311.646ms</td>
<td>178.677ms</td>
</tr>
<tr>
<td>310.179ms</td>
<td>177.646ms</td>
</tr>
</table>
</td>
<td>
<table border  style="width:30em; text-align:right"  >
<caption>fistp (via /QIfist) versus magic-number truncation<br />(4 trials)</caption>
<tr style="text-align:center">
<th><tt>fistp</tt></th>
<th>magic number</th>
</tr>
<tr>
<td>185.479ms</td>
<td>179.934ms</td>
</tr>
<tr>
<td>180.314ms</td>
<td>182.722ms</td>
</tr>
<tr>
<td>183.951ms</td>
<td>179.802ms</td>
</tr>
<tr>
<td>178.270ms</td>
<td>178.007ms</td>
</tr>
</table>
</td>
</tr>
</table>
<p></center></p>
<p>It looks like the magic-number trick varies between about 1% slower and 3% faster than native fistp across these trials, which is basically the same.</p>
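<p>For anyone who wants to repeat a measurement like this, here is a rough sketch of such a timing loop. It is not the code behind the numbers above: the names <tt>convert_pass</tt> and <tt>time_passes</tt>, the test data, and the use of <tt>clock()</tt> are all assumptions made for illustration. It times the plain cast; swapping the conversion inside <tt>convert_pass</tt> would time another method.</p>

```c
#include <stddef.h>
#include <time.h>

enum { COUNT = 1024 };  /* 1024 floats, small enough to stay in cache */

/* One pass over the array using a plain cast (the conversion under test).
   Returns the sum of the converted values so the work has a visible result. */
static long convert_pass(const float *src, int *dst, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; ++i) {
        dst[i] = (int)src[i];   /* truncating float-to-int conversion */
        sum += dst[i];
    }
    return sum;
}

/* Runs `reps` passes, stores a checksum, and returns elapsed CPU seconds. */
static double time_passes(int reps, long *checksum)
{
    static float src[COUNT];
    static int dst[COUNT];
    for (size_t i = 0; i < COUNT; ++i)
        src[i] = (float)i + 0.75f;   /* hypothetical test data */

    /* Accumulating into a volatile keeps every pass's result observable,
       which makes it harder for the optimizer to discard the loop. */
    volatile long sink = 0;
    clock_t start = clock();
    for (int r = 0; r < reps; ++r)
        sink += convert_pass(src, dst, COUNT);
    clock_t end = clock();

    *checksum = sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

<p>Calling <tt>time_passes(100000, &amp;checksum)</tt> would perform the same number of conversions as the test described above. Checking the disassembly is still the only sure way to confirm that the loop survived optimization.</p>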
]]></content:encoded>
	</item>
	<item>
		<title>By: Elan</title>
		<link>http://assemblyrequired.crashworks.org/2009/01/12/why-you-should-never-cast-floats-to-ints/comment-page-1/#comment-57</link>
		<dc:creator>Elan</dc:creator>
		<pubDate>Mon, 12 Jan 2009 19:55:32 +0000</pubDate>
		<guid isPermaLink="false">http://assemblyrequired.crashworks.org/?p=124#comment-57</guid>
		<description>Hmm, let me test that again; if you&#039;re right, I&#039;ll revise the article. I tried it once on my Core2, but it&#039;s possible the compiler actually optimized out my profiling loop. Thanks for the note.</description>
		<content:encoded><![CDATA[<p>Hmm, let me test that again; if you&#8217;re right, I&#8217;ll revise the article. I tried it once on my Core2, but it&#8217;s possible the compiler actually optimized out my profiling loop. Thanks for the note.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: cb</title>
		<link>http://assemblyrequired.crashworks.org/2009/01/12/why-you-should-never-cast-floats-to-ints/comment-page-1/#comment-56</link>
		<dc:creator>cb</dc:creator>
		<pubDate>Mon, 12 Jan 2009 18:55:23 +0000</pubDate>
		<guid isPermaLink="false">http://assemblyrequired.crashworks.org/?p=124#comment-56</guid>
		<description>Good article, but this -

&quot;It’s also possible to convert floats to ints by adding them to a certain magic number, but usually this isn’t a benefit. In times of yore the fistp op was slow, so there was an advantage to replacing the fist with a fadd, but this hasn’t been the case since the Pentium 3. Furthermore, most of the magic-number tricks involve an integer step on the floating-point number, which is disastrously slow because of the way the registers are partitioned.&quot;

is just not true on x86 PC CPUs.  My measurements roughly match the Stereopsis XS measurements - the magic-number-based rounding is still a little bit faster than fist, and it has the advantage of not relying on compiler settings or FPU rounding mode.  (Of course on Xenon and Cell it&#039;s a different story.)

Note that&#039;s only true for round-to-int; the XS truncs are a little bit slower than fist or cvttss2si.

Also, the big advantage of the SSE instructions is that you have both truncate &amp; round available at any time without changing modes.</description>
		<content:encoded><![CDATA[<p>Good article, but this -</p>
<p>&#8220;It’s also possible to convert floats to ints by adding them to a certain magic number, but usually this isn’t a benefit. In times of yore the fistp op was slow, so there was an advantage to replacing the fist with a fadd, but this hasn’t been the case since the Pentium 3. Furthermore, most of the magic-number tricks involve an integer step on the floating-point number, which is disastrously slow because of the way the registers are partitioned.&#8221;</p>
<p>is just not true on x86 PC CPUs.  My measurements roughly match the Stereopsis XS measurements &#8211; the magic-number-based rounding is still a little bit faster than fist, and it has the advantage of not relying on compiler settings or FPU rounding mode.  (Of course on Xenon and Cell it&#8217;s a different story.)</p>
<p>Note that&#8217;s only true for round-to-int; the XS truncs are a little bit slower than fist or cvttss2si.</p>
<p>Also, the big advantage of the SSE instructions is that you have both truncate &amp; round available at any time without changing modes.</p>
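<p>For readers who haven&#8217;t seen it, the magic-number round-to-int mentioned above can be sketched roughly as follows. This is one common form of the trick, not necessarily the exact Stereopsis XS code; the name <tt>magic_round</tt> is made up for illustration, and the sketch assumes IEEE-754 doubles, arithmetic carried out at double precision, and the default round-to-nearest mode.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 1.5 * 2^52. Adding this to a double in the 32-bit integer range yields a
   value whose spacing between representable numbers is exactly 1, so the
   addition itself rounds x to an integer and leaves it in the low bits of
   the mantissa. */
static const double MAGIC = 6755399441055744.0;

/* Hypothetical helper: round x to the nearest 32-bit integer
   (ties go to even under the default rounding mode). */
static int32_t magic_round(double x)
{
    /* The result must fit in 32 bits for the bit extraction to be valid. */
    assert(x >= -2147483648.0 && x <= 2147483647.0);

    double sum = x + MAGIC;            /* the floating-point add does the rounding */
    int64_t bits;
    memcpy(&bits, &sum, sizeof bits);  /* reinterpret the double's bit pattern */

    /* The low 32 bits hold the rounded value in two's complement. */
    return (int32_t)(uint32_t)bits;
}
```

<p>For example, <tt>magic_round(2.5)</tt> gives 2 and <tt>magic_round(3.5)</tt> gives 4, since ties round to even. The copy of the double&#8217;s bits into an integer is the &#8220;integer step on the floating-point number&#8221; that the quoted passage refers to.</p>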
]]></content:encoded>
	</item>
</channel>
</rss>
