Porting Voxlap

**Sonarpulse** » Tue Jan 01, 2013 7:53 pm

Iceball is a complete rewrite, with a different server protocol. This, for now, is just me porting Voxlap so that things like 64-bit builds (no, you won't see much benefit besides ease of install and compile), Linux, ARM, and Mac OS X become possible. [Note that these things are also possible with Iceball right now.]

I recently hit a critical point in my porting: the stuff being rendered actually appears on the screen, albeit a bit messed up and prone to crashing. Now I can visually check the correctness of my port, not just compare disassembly and cross my fingers.

Once that is done, yes I'd like to make a straight up clone of 0.75, compatible with the existing servers and all that. Technically work on that can start right now, but as most of the other devs are on linux, waiting for my port to finish will be easier.

Technically, Voxlap could be substituted for Iceball's renderer, but that would take a lot of work, and isn't legally possible due to licensing conflicts mentioned above.

**VladVP** » Wed Jan 02, 2013 11:59 am

I see you guys are having trouble with porting assembly, eh?
Maybe I could help... I don't know much MASM, but I have been programming in NASM for a couple of years now...

Mind if I take a look at v5.asm that you're having this much trouble with?

**Sonarpulse** » Wed Jan 02, 2013 5:23 pm

Sure! I'd love all the help I can get!

v5.nasm actually works perfectly, but if we are going to do a 64-bit build, drawboundcubesse in v5.nasm needs to be rewritten in C with gcc's vector extensions: http://gcc.gnu.org/onlinedocs/gcc/Vecto ... sions.html.
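A minimal sketch of what the gcc vector extensions look like, for anyone who has not used them. This is not code from v5.nasm or from the port; the type name `v4sf` and the helpers `scale_add` and `hsum` are made up for illustration. The point is only that ordinary C operators work lane by lane on these types and the compiler picks the instructions for the target.

```c
/* Four packed floats; vector_size is given in bytes (GCC/Clang extension). */
typedef float v4sf __attribute__((vector_size(16)));

/* Hypothetical helper: scale a point and add an offset, lane by lane. */
static v4sf scale_add(v4sf p, float s, v4sf off)
{
    v4sf sv = { s, s, s, s };   /* broadcast the scalar by hand */
    return p * sv + off;        /* element-wise multiply and add */
}

/* Sum the four lanes; vectors can be subscripted like arrays. */
static float hsum(v4sf v)
{
    return v[0] + v[1] + v[2] + v[3];
}
```

Because nothing here names an x86 register or instruction, the same source should build for ARM or x86-64 without changes, which is the attraction over a hand-written SSE routine.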

There are other things in v5.nasm too, but I believe undefining USEV5ASM in voxlap will cause them to be taken care of.

The thing I am working on ATM is the gcc extended inline asm. You can look up extended asm, it's pretty whack. :D

**VladVP** » Wed Jan 02, 2013 10:12 pm

Well.. yeah... besides some questionable section directives, I can't really blame your nasm port of v5.asm. My conclusion was that the syntax was flawless, and my assembler said the exact same. At least for the syntax! I'm not completely sure what parts of the code do what, so I'm not sure why you would rewrite anything in C.

**Sonarpulse** » Thu Jan 03, 2013 5:21 am

Hmm what looked iffy? I only said perfect because it looked right and I tried it with MSVC and it worked perfectly.

As far as getting anything to run smoothly on linux, I don't think it needs any more work. But I'd love to see this work on other processors too, not just with other compilers. Therefore I would like to eventually rewrite it with those architecture-independent built-ins.

If you'd rather work on the more "urgent" task, getting x86-32 gcc builds to work bug-free, then I suggest you help me get the gcc extended inline assembly in voxlap5.cpp and sdlmain.cpp bug-free.

Either the inline assembly can be fixed or it can be replaced with C. The master branch is conservative and uses inline assembly everywhere the original does. The no_asm branch attempts pure C, or at least architecture neutral code like those gcc vector built ins. I'll probably create more branches as more options present themselves.

**VladVP** » Thu Jan 03, 2013 8:23 pm

Well... there wasn't really anything wrong; just the way you wrote sections:

Code: Select all

SEGMENT .text

for example. It's perfectly fine, but the "usual" way of doing it is either

Code: Select all

SECTION .text

or

Code: Select all

SEGMENT _TEXT

but as far as I'm aware the two things are completely synonymous, and you can mix and match section/segment and .text/_text as you please.

And well yeah; I took a look at voxlap5.cpp in the main branch (the one with ASM, I assume), and tried building it with GCC. Surprisingly enough, it only returned two errors and a few unimportant warnings; I have only taken a look at the first error, which was found in the function

Code: Select all

static inline void mmxcoloradd (long *a)
{
	#ifdef __NOASM__
	#endif
	#if defined(__GNUC__) && !defined(__NOASM__) //AT&T SYNTAX ASSEMBLY
	__asm__ __volatile__
	(
		".intel_syntax noprefix
		psubusb	%[a], flashbrival
		.att_syntax prefix"
		: [a] "=y" (*a)
		:      "0" (*a)
		:
	);
	#endif
	#if defined(_MSC_VER) && !defined(__NOASM__) //MASM SYNTAX ASSEMBLY
	_asm
	{
		mov	eax, a
		movd	mm0, [eax]
		psubusb	mm0, flashbrival
		movd	[eax], mm0
	}
	#endif
}

Sooo... the first error I noticed was not something that my compiler found, but was actually in the "#if defined(_MSC_VER)" preprocessor directive. You may already know this, but let me just point out that the Microsoft inline assembly... let's call it "initiator"... is

Code: Select all

__asm { test ax, ax }

instead of

Code: Select all

_asm { test ax, ax }

so you need two and not one underscore for the MSC version of inline assembly.

The second thing is in "defined(__GNUC__)". The

Code: Select all

.intel_syntax noprefix

directive actually confused me a bit because the assembly code directly requested the C(++) variable "flashbrival", which we all know you cannot do in AT&T syntax inline assembly, but which you apparently can if you use the above directive.
But that just confused me more, because if we could just request variables directly like in MSC, then what's the input "section" of __asm__ for??
Confusion like this made me conclude that it would be much easier to just write the GCC inline assembly in AT&T syntax, as it is supposed to be. I have taken the liberty of writing this quick inline asm code, which is merely a direct translation of the MSC code:

Code: Select all

	__asm__ __volatile__
	(
		"movd (%0), %mm0; //set mm0 to *a
		psubusb %1, %mm0; //perform the MMX psubsb instruction between flashbrival and mm0
		movd %mm0, (%0);" //move mm0 back into *a
		: /* the code aren't really supposed to output anything, is it? */
		: "r" (a), "r" (flashbrival)
		/* we don't have to specify the last "section" of __asm__ */
	);

I'm not sure if it's going to work, but I can assure you that it's practically a direct copy of the other MSC-based inline assembly, just in AT&T. At least as far as what I have learned about inline AT&T assembly...

Meh, I'll just post my progress here to see if you guys have anything that proves this wrong

**Sonarpulse** » Thu Jan 03, 2013 9:16 pm

Ah! excellent points. Do you want to go on #aos.development on quakenet?

Good to know about proper "form" with nasm. Thanks for the tip. Yes, the master branch is where the ASM is. This particular function is used by both voxlap itself and voxed, therefore I am considering moving it and "flashbrival" into voxlap.h

Precisely: in MSVC you can directly reference variables by their C(++) names and it will take care of everything for you, but with gcc you can only do that in C, by taking advantage of the fact that symbol names are not mangled. Then of course you also give up the ability to move things into a register. Or use "g", signifying that the compiler may either cache the value in a register or pass the address of the value in memory; it is the most flexible option and one I should probably use more.

Until recently I thus had a problem when I needed to reference arrays or other data types I couldn't fit into a register. But now I have the solution. Check out fpuinit and expandbit256 in sdlmain and voxlap respectively. For link-time constants such as the addresses of global variables, you can use the "p" constraint. Then you can use %[temp_name] to get $address, or the basically undocumented %c[temp_name] to get address. I gather the "c" in "%c" stands for contents, as using "%c" will dereference the address given for [temp_name]. However, in what might be viewed as a peculiarity of AT&T syntax, offsets in "offset(base_register, index_register, scaling_factor)" are NOT prefixed with a "$". So you would do "0x5F(%edx,%ecx,4)", NOT "$0x5F(%edx,%ecx,4)". This means that when the offset is something from the input list that you are SURE will become a build-time constant (not be passed in as a register), you use %c[asdf] to pass in the static address itself, not the contents at that address. I guess since the whole line is a dereference, the %c makes some sense, but it took me a while to figure this out.
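A small sketch of the %c behaviour, kept separate from the Voxlap code. As far as the GCC manual describes it, the `c` operand modifier prints a constant operand without the `$` punctuation, which is exactly what a displacement in `offset(base)` needs; it does not itself dereference anything. The example assumes an x86 target and uses the "i" constraint with a plain integer rather than "p" with a symbol address, to keep it simple. `table` and `load_second` are made-up names, and other targets fall back to plain C.

```c
#include <stdint.h>

static int32_t table[4] = { 10, 20, 30, 40 };

/* Load the second element of `base` using a constant displacement. */
static int32_t load_second(const int32_t *base)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int32_t out;
    __asm__ __volatile__ (
        /* %c[off] prints "4", not "$4", so this assembles as 4(%reg) */
        "movl %c[off](%[base]), %[out]"
        : [out] "=r" (out)
        : [base] "r" (base), [off] "i" (4)
        : "memory"   /* the asm reads memory the compiler cannot see */
    );
    return out;
#else
    return base[1];  /* non-x86 fallback for this sketch */
#endif
}
```

Calling `load_second(table)` reads `table[1]`. Writing `%[off]` instead of `%c[off]` would emit `$4(%reg)`, which the assembler rejects.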

I could just pass -masm=intel to gcc, instructing it to output Intel syntax. But it's probably easier to just switch to AT&T when you need to use something from the input/output list, as I have done in expandbit256. Of course for something like "mmxcoloradd" that means rewriting the whole thing in AT&T, as it does for the functions in ksnippits.h, but for the bigger chunks with lots of clobbers we can keep the bulk of it in Intel syntax to save effort.

EDITS:
If the reference to "flashbrival" worked, it's because I used my FORCE_NAME macro to override gcc's choice of symbol name for it. I always hated that solution, and now that I have found out about "p" and %c I am determined NOT to use it.

Also, are you on Windows? My basic strategy is to make things (besides v5, as that will take more work) C so I can be sure MSVC and GCC are working from the exact same code and fix bugs that way. Then I plan on making a temp branch and undoing each commit turning master into no_asm to debug the gcc inline assembly one group at a time. Of course sometimes Ken's C itself is buggy, as it is in this commit: https://github.com/Ericson2314/Voxlap/c ... 05943dfa40

Lastly, I see you have a github. I starred it. Beware though that I also rebase no_asm off the last commit of master. Now that you are working on this too I will refrain from destructively updating master, however.

That link is going to go dead when I rebase no_asm, so here's the commit name: " second C alternative for 'updatereflects' now in use"

**Sonarpulse** » Fri Jan 04, 2013 1:11 am

OK, among other things I moved mmxcolor* into a header file and did them in at&t inline asm. I am hoping the compiler will be smart enough to do the MOVLs by itself, but if not I will put them in explicitly your way (and use an input of

Code: Select all

[a] "g" (*a)

which should work).

Also I thought more about what steps should be taken to complete this, and have come up with 4 stages, each of which can correspond to a branch in git. Master and no_asm clearly are not enough. Also I should probably just forget about the "__NOASM__" preprocessor macro I have now, and just rely on branches. Here are the steps/branches:

Conservative: This is what master is now. In short, every block of MSVC inline asm has a block of gcc inline asm to go with it. Nothing else semantically should be changed. I doubt this will work, as sometimes Voxlap used persistent registers.
No_CPUtype_dispatch: Currently Voxlap actually contains both SSE2 and 3DN inline assembly, and uses some run-time stuff to choose which routines to use. This is really weird, and makes replacing with C slightly more complicated as some functions like "movps_3dn" are no longer used. I should remove the "cputype" routine, and make the choice of architecture completely compile-time. This should be rebased off conservative.
no_asm: This is no_asm, a rush to get something that works and also uses as little compiler-specific code as possible. I have used Ken's C alternatives even when vector built-ins would be more efficient, or when said alternatives don't even work right (kv6draw). All the point4d vector arithmetic functions should be moved to a separate header. This should be rebased off No_CPUtype_dispatch.
SIMD: This is where I want to end up. This will be completely architecture independent. Pure C is used where appropriate, but if something is better done with SIMD/vector built-ins I use those instead. By the time I get to this, no_asm should be bug free. And if I haven’t bothered to redo v5.?asm in no_asm, I should redo it now with vector built-ins. Probably a lot of the little inline functions used for assembly can be manually inlined and discarded at this point. I may drop MSVC support here, as I don't really want to have to redo all the SIMD stuff for two compilers. Also I might do the minimal preprocessor stuff needed to allow for building as pure C if I have not done so already (this involves making a few static struct members into global variables, so it will change the API). This should be rebased off No_CPUtype_dispatch, not no_asm.
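To illustrate the No_CPUtype_dispatch idea, here is a minimal sketch of choosing the implementation at compile time instead of through a run-time "cputype" check. It is not taken from Voxlap: `add4f` is a hypothetical stand-in for one of the point4d helpers, and the only selection mechanism assumed is the compiler's predefined `__SSE2__` macro.

```c
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics; also pulls in the SSE ones */
#endif

/* Hypothetical stand-in for a point4d helper: dst = a + b, four floats. */
static void add4f(float *dst, const float *a, const float *b)
{
#if defined(__SSE2__)
    /* Unaligned loads/stores, so callers need no special alignment. */
    _mm_storeu_ps(dst, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
#else
    /* Scalar path for every other target. */
    for (size_t i = 0; i < 4; i++)
        dst[i] = a[i] + b[i];
#endif
}
```

Only one of the two bodies ends up in the binary, so there is no dead 3DN-style variant left over to keep in sync.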

**Cajun Style** » Fri Jan 04, 2013 12:12 pm

As I said on IRC: good to see more structure. Also very nice to have somebody else who understands assembly.
It is kind of noobish, but I'm totally stumped by these expressions:

Code: Select all

void func(long x, ...){
[...]
*(long *)(x+ 4) = fcol;

It seems that I've interpreted them the only way that they make sense, and it is correct:

Code: Select all

*((long *)(x+ 4)) = fcol;

This is freaking overrun paradise, and if I try to compile a stand-alone program doing this, it crashes. Is it because in DOS you could assume address 0 is the framebuffer?! (In MSVC... too???) Otherwise this makes NO SENSE AT ALL. NVM there is this global named "frameplace" designating the buffer location...

**Sonarpulse** » Fri Jan 04, 2013 5:51 pm

Ouch, that is ugly. The reason this works is because if you look at the for loop, x is set to y, and y depends on some sort of framebuffer offset (and the original, parameter x). So yes, it's mutating your parameter which nobody likes, but it does work.

Code: Select all

	//here is parameter x
	y = y*bytesperline+(x<<2)+frameplace;
	if (bcol < 0)
	{
		for(j=20;j>=0;y+=bytesperline,j-=4)
			for(c=st,x=y;*c;c++,x+=16) //x is set to y

I should probably add a commit with some comments about this. Voxlap is already sorely lacking in comments.
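A stand-alone sketch of the same idiom, for anyone else stumped by it: the address is computed as a plain integer from a buffer base, then cast to a pointer and written through. The 4x4 buffer `fb` and the function `put_pixel` are hypothetical; only `frameplace` and `bytesperline` mirror names from the excerpt. It uses `intptr_t` rather than `long`, since `long` is not guaranteed to be as wide as a pointer (on 64-bit Windows it is 32 bits).

```c
#include <stdint.h>

/* Hypothetical 4x4 framebuffer of 32-bit pixels standing in for the real one. */
static int32_t fb[4 * 4];
static const intptr_t bytesperline = 4 * sizeof(int32_t);

/* Write colour `col` at pixel (x, y) with the excerpt's address arithmetic. */
static void put_pixel(intptr_t x, intptr_t y, int32_t col)
{
    intptr_t frameplace = (intptr_t)fb;                       /* buffer base as an integer */
    intptr_t addr = y * bytesperline + (x << 2) + frameplace; /* byte address of the pixel */
    *(int32_t *)addr = col;                                   /* same shape as *(long *)(x+4) = fcol */
}
```

A stand-alone test that passes a small integer as `x` without adding a valid base address writes to an arbitrary location, which would explain the crash described above.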

**VladVP** » Sun Jan 06, 2013 7:33 pm

I just tried building voxlap5.c(pp) with GCC, with all files in "/include", and "/sources" in the same directory. It returned these errors and warnings:

Code: Select all

C:\Users\VladVP\Desktop\voxmerge\voxlap5.cpp|10105|warning: this decimal constant is unsigned only in ISO C90 [enabled by default]
C:\Users\VladVP\Desktop\voxmerge\voxlap5.cpp|10105|warning: this decimal constant is unsigned only in ISO C90 [enabled by default]
C:\Users\VladVP\Desktop\voxmerge\voxflash.h |     |In function 'void mmxcoloradd(long int*)':
C:\Users\VladVP\Desktop\voxmerge\voxflash.h |46   |error: impossible constraint in 'asm'
C:\Users\VladVP\Desktop\voxmerge\voxlap5.cpp|     |In function 'void voxsetframebuffer(long int, long int, long int, long int)':
C:\Users\VladVP\Desktop\voxmerge\voxlap5.cpp|13702|error: unknown register name 'mm0' in 'asm'
||=== Build finished: 2 errors, 2 warnings (0 minutes, 1 seconds) ===

As you can see, the warnings are pretty much harmless, but here is the code that generated the first error:

Code: Select all

	__asm__ __volatile__
	(
		"paddusb	%c[fb], %[a]\n"
		: [a] "+y" (*((uint32_t*)a))
		: [fb] "p" (&flashbrival)
		:
	);

And the code that generated the second error:

Code: Select all

__asm__ __volatile__
			(
				".intel_syntax noprefix\n"
				"xor	eax, eax\n"
				"mov	ecx, -2048*8\n"
			".Lfogbeg:\n"
				"movd	mm0, eax\n"
				"add	eax, edx\n"
				"jo	short .Lfogend\n"
				"pshufw	mm0, mm0, 0x55\n"
				"movq	foglut[ecx+2048*8], mm0\n"
				"add	ecx, 8\n"
				"js	short .Lfogbeg\n"
				"jmp	short .Lfogend2\n"
			".Lfogend:\n"
				"movq	mm0, all32767\n"
			".Lfogbeg2:\n"
				"movq	foglut[ecx+2048*8], mm0\n"
				"add	ecx, 8\n"
				"js	short .Lfogbeg2\n"
			".Lfogend2:\n"
				"emms\n"
				".att_syntax prefix\n"
				:
				: [i] "d" (i)
				: "eax", "ecx", "mm0"
			);

So the first thing that comes into my mind is how "\n" is used at the end of every instruction call. From what I have learned about inline AT&T ASM, the way it's done is simply with a semicolon (";") and as far as I know, you also don't have to bind multiple strings together like that, but simply place an initial quotation mark on one line, and a terminating one on another line.
Although both methods seem to work fine without any obvious errors, I still think we need some kind of "standard" way of doing this, so that the code won't look too messy...
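For reference, a sketch of why the listing above quotes every line separately: the compiler joins adjacent string literals into one, so the assembler still receives a single template. The instruction text is only sample data here and nothing is assembled; `asm_newlines`, `asm_semicolons` and `joined_as_expected` are made-up names. Note that, as far as I know, current GCC rejects a raw line break inside one pair of quotes, so some per-line quoting is needed with either separator.

```c
#include <string.h>

/* Style 1: one literal per instruction, separated by "\n\t". */
static const char *asm_newlines =
    "xor eax, eax\n\t"
    "add eax, edx\n\t"
    "emms";

/* Style 2: the same instructions separated by semicolons. */
static const char *asm_semicolons =
    "xor eax, eax; "
    "add eax, edx; "
    "emms";

/* Returns 1 if the pieces really were joined into one template string. */
static int joined_as_expected(void)
{
    return strcmp(asm_semicolons, "xor eax, eax; add eax, edx; emms") == 0;
}
```

Either style gives the assembler the same three instructions; the "\n\t" form just makes the compiler's generated assembly listing easier to read.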

Secondly, could someone please explain to me what those square brackets are doing in the AT&T ASM? Their use is either extremely undocumented, or I have skipped something when I read through the tutorials... (but honestly, I went back to the tutorials and Control-F'd both "[" and "]", but found only Intel syntax examples.)

Thirdly, I thought we all agreed on not using ".intel_syntax"? Or is that just because no one has fixed that yet?... If so, then excuse me.

I just felt that before I make any changes to my own repo and make pull requests and stuff, I should ask you guys about all of this first...

PS: Commenting the ASM block out does actually fix the error in voxflash.h, and there doesn't seem to be any errors with the other ASM block in the file... it's actually very strange, because the only difference between the two is the "psubusb" and "paddusb" MMX instructions...

**Sonarpulse** » Sun Jan 06, 2013 8:18 pm

I only learned about the semicolons recently. doing that, and one multi-line string does sound a lot more readable.

the square brackets are used to reference constraints with a name, so you don't have to do %0 %1 %2 which is harder to read. it works like this:

Code: Select all

: [name] "constraint" (C expression)

The names have nothing to do with C variables or asm symbols, so you can use whatever. (it's local to the __asm__ block too.)
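A tiny sketch of named operands in use, assuming an x86 target with GCC or Clang (other targets take the plain C branch). The function `add_named` and the operand names `dst` and `src` are arbitrary choices for illustration.

```c
/* Add two ints: inline asm with named operands on x86, plain C elsewhere. */
static int add_named(int a, int b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ (
        "addl %[src], %[dst]"   /* AT&T order: source first, destination second */
        : [dst] "+r" (a)        /* read-write operand, referred to as %[dst] */
        : [src] "r"  (b)        /* input operand, referred to as %[src] */
        : "cc"                  /* addl changes the flags */
    );
    return a;
#else
    return a + b;
#endif
}
```

The same template written with positional operands would be `"addl %1, %0"`; the names only make it harder to mix the two up when the operand lists change.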

extended asm is poorly documented but I believe this part actually is, it's just newer. the %c[name] thing is NOT documented. but I describe it in my previous post here: http://buildandshoot.com/viewtopic.php? ... t=50#p6921

You are correct in that I haven't really tested conservative (the branch you are compiling I think), as I still get errors building it. However I don't get the errors you mentioned. They make me think that you are doing a 64-bit compile, as mmx registers such as "mm0" are deprecated in x86-64.

What I wrote up above is that .intel_syntax noprefix is safe for any block that doesn't use any inputs and outputs specified in the constraints. This is because this code is just handed off to the assembler verbatim. However whenever we use the constrained I/O stuff (ie things in square brackets)

**Sonarpulse** » Mon Jan 07, 2013 2:53 am

Well, I am working on converting the last bit of inline assembly in voxlap5.cpp, expandbit256, to C. It looks pretty good but it still causes the test game to segfault immediately, so if anybody wants to fix it that would be great!

Here is what I've got: https://pastee.org/xhr32

**VladVP** » Mon Jan 07, 2013 8:46 pm

Sonarpulse wrote:
I only learned about the semicolons recently. doing that, and one multi-line string does sound a lot more readable.

Does that mean we should just use semicolons and multiline strings from now on?

Sonarpulse wrote:
the square brackets are used to reference constraints with a name, so you don't have to do %0 %1 %2 which is harder to read. it works like this:
Code: Select all
: [name] "constraint" (C expression)
The names have nothing to do with C variables or asm symbols, so you can use whatever. (it's local to the __asm__ block too.)

Oh... now your code makes MUCH more sense... xD

Sonarpulse wrote:
extended asm is poorly documented but I believe this part actually is, it's just newer. the %c[name] thing is NOT documented. but I describe it in my previous post here: http://buildandshoot.com/viewtopic.php? ... t=50#p6921

So... if I understand it right, %c[name] dereferences those [name] references? Then what about round brackets?

Sonarpulse wrote:
You are correct in that I haven't really tested conservative (the branch you are compiling I think), as I still get errors building it. However I don't get the errors you mentioned. They make me think that you are doing a 64-bit compile, as mmx registers such as "mm0" are deprecated in x86-64.

Yes, I am mainly focusing on conservative right now as it is the first "step", and it seems unfinished. Oh, and DUMBASS I am indeed... I completely forgot that I am running on a 64-bit system... Sorry for my stupidity, I got a brand new PC for Christmas, but I'm not used to its x64 programming environment yet...
And regarding the MMX registers, I've read that they aren't completely deprecated in x86-64 as they are still supported. Where have you found out that they were deprecated in the first place by the way?
Oh, well... I think I'll try to find a way to configure GCC to make 32-bit compilations in Code::Blocks now...

Sonarpulse wrote:
What I wrote up above is that .intel_syntax noprefix is safe for any block that doesn't use any inputs and outputs specified in the constraints. This is because this code is just handed off to the assembler verbatim. However whenever we use the constrained I/O stuff (ie things in square brackets)

K.

Sonarpulse wrote:
Well, I am working on converting the last bit in inline assembly in voxlap5.cpp, expandbit256, to C. It looks pretty good but it still causes the test game to segfault immediately so if anybody wants to fix it that would be great!

Here is what I've got: https://pastee.org/xhr32

Hmm... looks complicated... I'll try taking a look at it tomorrow...

**Sonarpulse** » Tue Jan 08, 2013 12:12 am

VladVP wrote:
Does that mean we should just use semicolons and multiline strings from now on?

Yeah, I think that would be easier. We should probably convert what we have now too.

VladVP wrote:
Oh... now your code makes MUCH more sense... xD

Glad that helps.

VladVP wrote:
So... if I understand it right, %c[name] dereferences those [name] references? Then what about round brackets?

Here's the thing: the very new "%c" tells gcc to dereference the [name] value. The parentheses are the AT&T-syntax way of dereferencing in the assembly: gcc just passes them to the assembler verbatim. If it weren't for the fact that I knew, because of the "p" constraint, that the value given (the address of a global variable) would have to be resolved to a (link-time) constant, and thus a number with no dollar prefix, I would probably end up with two sets of parentheses verbatim.

Important to note: when one uses a constraint such as "m", the value is referenced automatically, and then dereferenced automatically, so you need neither %c nor & in the C expression. With "r" it likewise substitutes the register name for the value, which is also sort of like referencing, except register names are "dereferenced" back to their contents automatically. With "p" it DOESN'T automatically reference the value, which is why I have to do that in the C expression.
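A short sketch contrasting "m" and "r" under the same assumptions as before (x86 target, GCC or Clang, plain C fallback elsewhere). `counter`, `read_via_m` and `bump_via_r` are invented for the example. With "m" the C expression is the object itself and the compiler writes the memory reference; with "r" the compiler loads the value into a register first and substitutes the register name.

```c
#include <stdint.h>

static int32_t counter = 41;

/* "m": pass the object; the template receives a memory operand. */
static int32_t read_via_m(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int32_t out;
    __asm__ ("movl %[src], %[dst]"
             : [dst] "=r" (out)
             : [src] "m" (counter));
    return out;
#else
    return counter;
#endif
}

/* "r": the value arrives in a register chosen by the compiler. */
static int32_t bump_via_r(int32_t v)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ ("addl $1, %[v]" : [v] "+r" (v) : : "cc");
    return v;
#else
    return v + 1;
#endif
}
```

In neither case does the template need `%c` or an explicit `&`, which matches the description above.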

VladVP wrote:
Yes, I am mainly focusing on conservative right now as it is the first "step", and it seems unfinished.

Here's the thing, I don't really know whether it's faster to hack some C that works with both compilers and then fix the gcc assembly, or vice versa, which is why I see both no_asm and conservative as the first step, and do a hell of a lot of rebasing in git. Certainly working on either branch will help the other one, so do whatever you think is best.

On a somewhat related note, when I emailed the gcc help thread, they suggested that I make a test_main, so to speak. So far I have just been doing static code comparisons, but especially if we are going to try to get conservative working first, some dynamic testing would go a long way. It will be a pain to run the MSVC build and build up tables of values, but if we buckle down and do it I think we'll be glad we did.
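One possible shape for such a test, sketched under assumptions: plain C reference versions of the packed saturating byte operations that psubusb and paddusb perform (here on four bytes in a 32-bit value), plus a helper that counts disagreements between a candidate and the reference over a table of inputs. None of these names (`sub_sat_u8x4`, `add_sat_u8x4`, `count_mismatches`) exist in Voxlap; the candidate would be a wrapper around the inline asm under test.

```c
#include <stddef.h>
#include <stdint.h>

/* Reference for psubusb on four packed bytes: each byte clamps at 0. */
static uint32_t sub_sat_u8x4(uint32_t a, uint32_t b)
{
    uint32_t r = 0;
    for (int i = 0; i < 4; i++) {
        uint32_t x = (a >> (8 * i)) & 0xFF;
        uint32_t y = (b >> (8 * i)) & 0xFF;
        uint32_t d = x > y ? x - y : 0;
        r |= d << (8 * i);
    }
    return r;
}

/* Reference for paddusb on four packed bytes: each byte clamps at 255. */
static uint32_t add_sat_u8x4(uint32_t a, uint32_t b)
{
    uint32_t r = 0;
    for (int i = 0; i < 4; i++) {
        uint32_t s = ((a >> (8 * i)) & 0xFF) + ((b >> (8 * i)) & 0xFF);
        r |= (s > 255 ? 255 : s) << (8 * i);
    }
    return r;
}

typedef uint32_t (*op_fn)(uint32_t, uint32_t);

/* Count inputs on which `candidate` disagrees with `reference`. */
static size_t count_mismatches(op_fn candidate, op_fn reference,
                               const uint32_t (*cases)[2], size_t n)
{
    size_t bad = 0;
    for (size_t i = 0; i < n; i++)
        if (candidate(cases[i][0], cases[i][1]) != reference(cases[i][0], cases[i][1]))
            bad++;
    return bad;
}
```

The same table of input pairs could be run through the MSVC build once and the outputs saved, so the gcc build is checked against recorded values rather than against itself.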

VladVP wrote:
I got a brand new PC for Christmas, but I'm not used to its x64 programming environment yet...

Eh, don't worry about it. None of the errors made that problem super obvious. And for that matter I am suspicious my mmxcolor* gcc inline asm might be silently failing on 32 bit too.

VladVP wrote:
And regarding the MMX registers, I've read that they aren't completely deprecated in x86-64 as they are still supported. Where have you found out that they were deprecated in the first place by the way?

No idea, it was a while ago and I just stumbled on it.

VladVP wrote:
Hmm... looks complicated... I'll try taking a look at it tomorrow...

Great. GreaseMonkey said that it has to do with unpacking vxls, which matches what little I can glean from it. Some of it I can easily turn into a do-while loop, and there are a couple of other things that would make it more readable, but I left it as is for now so it can be compared side by side to the assembly (and hopefully the disassembly can be too.)
