Problem with string arrays in assembly

  • #1
maple23
I have been working with assembly (TASM32) for a few months now and have run into a problem which I cannot fix. Here's a working example written in C++ which needs to be converted to assembly.
Code:
#include <windows.h>

int	main(){
	char	*name_list[5] = {"Micheal", "Stefan", "Judy", "William", "Lora"};
	for(int i = 0; i < 5; i++){
		MessageBox(0, name_list[i], name_list[i], 0);
	}
	return 0;
}
Here's the assembly version I've written. It goes through the five names fine, but after the last name it brings up a message box with random characters.
Code:
.386
.model flat

EXTRN	MessageBoxA : PROC
EXTRN	ExitProcess : PROC

.DATA
	dd ?			; dummy dword (TASM workaround)

.CODE
MAIN:
	pushad
	call	lblNames
		db "Micheal", 0
		db "Stefan", 0
		db "Judy", 0
		db "William", 0
		db "Lora", 0

lblNames:
	pop	esi		; esi = current name
	push	5		; 5 names
	pop	ecx		; ecx = counter

lblNameLoop:
	push	0
	push	esi
	push	esi
	push	0
	call	MessageBoxA

lblNextChar:
	lodsb
	test	al, al
	jnz	lblNextChar

	pop	ecx
	loop	lblNameLoop

	popad

	push	0
	call	ExitProcess
END	MAIN
Does anyone know what the problem is, or have any suggestions for me? This seems much more complicated than it should be.

Sorry for my English.

Thank you,
Stefan Kendrick
 
  • #2
Why don't you just compile the C++ code so that it produces assembly code
rather than an executable?
With most Unix-style C compilers the option for this is -S, e.g.
cc -S program.c
 
  • #3
Compiling with the /Fa switch (Microsoft's C compiler) would have helped. Note that the C compiler normally generates pointers to the strings and pushes these pointers onto the stack (or passes them in registers under the fastcall convention) when calling a function:

Code:
        .DATA
string1	db "Micheal", 0
string2	db "Stefan", 0
string3	db "Judy", 0
string4	db "William", 0
string5	db "Lora", 0

        .CODE

        push    offset ds:string5
        push    offset ds:string4
        push    offset ds:string3
        push    offset ds:string2
        push    offset ds:string1
        call    lblNames
        add     esp,20

However, your C code isn't passing the five strings to a function; instead the array will look something like this:

Code:
        .DATA
string1	db "Micheal", 0
string2	db "Stefan", 0
string3	db "Judy", 0
string4	db "William", 0
string5	db "Lora", 0

name_list struct
        dd      offset ds:string1
        dd      offset ds:string2
        dd      offset ds:string3
        dd      offset ds:string4
        dd      offset ds:string5
name_list ends
 
  • #4
After making the "name_list" STRUCT, how do I get the names from it? And why am I adding 20 to the ESP register?
 
  • #5
I was able to fix my code with 1 more line:

Code:
	pop	ecx		; ecx = counter

lblNameLoop:
	push	ecx		; added line

	push	0
	push	esi
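Adding PUSH ECX saves the counter across the call, so the POP ECX before LOOP restores it. Without the push, POP ECX pulled an unrelated value off the stack (one of the registers saved by PUSHAD), so the loop ran the wrong number of times.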
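
For comparison, the LODSB loop in this version advances ESI past one name, including its terminating zero, to reach the next one. A rough C equivalent of that walk is sketched below. The name nth_packed is made up for illustration, and the sketch assumes the strings sit back to back in one buffer, as they do after the CALL lblNames trick.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the n-th (0-based) string in a buffer of back-to-back
   null-terminated strings. The caller must ensure n is in range,
   because nothing in the buffer marks where the list ends. */
static const char *nth_packed(const char *packed, size_t n)
{
    assert(packed != NULL);
    const char *p = packed;
    while (n-- > 0) {
        p += strlen(p) + 1;   /* skip the characters and the 0 byte */
    }
    return p;
}
```

Calling nth_packed("Micheal\0Stefan\0Judy\0William\0Lora", 2) yields a pointer to "Judy". As in the assembly, the count of names has to be tracked separately, which is why a wrong counter runs the walk off the end of the list.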
 
  • #6
I am still interested in how to use your STRUCT function.
 
  • #7
How do I have two string arrays being searched through at the same time? C example:
Code:
#include <windows.h>

int	main(){
	char	*name_list[5] = {"Micheal", "Stefan", "Judy", "William", "Lora"};
	char	*color_list[4] = {"red?", "orange?", "blue?", "green?"};
	char	szFullMSG[100] = "";

	for(int i = 0; i < 5; i++){
		for(int x = 0; x < 4; x++){
			lstrcpy(szFullMSG, "Hello, ");
			lstrcat(szFullMSG, name_list[i]);
			lstrcat(szFullMSG, ". Is your favorite color ");
			lstrcat(szFullMSG, color_list[x]);
			MessageBox(0, szFullMSG, szFullMSG, 0);
		}
	}

	return 0;
}
 
  • #8
maple23 said:
After making the "name_list" STRUCT, how do I get the names from it? And why am I adding 20 to the ESP register?

The "add esp,20" restores the stack pointer, which was decremented by 4 for each of the 5 pushes before the call (assuming 4-byte, 32-bit pointers), for a total of 20 bytes. This is the C convention: the caller cleans up. With the Pascal convention the number of operands to a function is known in advance, so the function itself does a "ret 20" to return and restore ESP. Using a structure is awkward in this case; just having an array of pointers to strings is better. Note that sizeof(dword) could be used instead of the 4 on the line beginning with main0:

main0: mov esi,namep[ecx*sizeof(dword)] ;esi = ptr to a name string

Code:
        title   x
;-----------------------------------------------------------------------;
;       x.asm           Example assembly program                        ;
;-----------------------------------------------------------------------;
        .386p
        .model  FLAT
;       include C library
        IF              @Version EQ 611
        INCLUDELIB      LIBC
        ELSE
        INCLUDELIB      MSVCRTD
        INCLUDELIB      OLDNAMES
        ENDIF
;-----------------------------------------------------------------------;
;       data                                                            ;
;-----------------------------------------------------------------------;
_DATA   segment
name1   db      'Micheal',0             ;names
name2   db      'Stefan',0
name3   db      'Judy',0
name4   db      'William',0
name5   db      'Lora',0

namep   dd      offset flat:name1
        dd      offset flat:name2
        dd      offset flat:name3
        dd      offset flat:name4
        dd      offset flat:name5

nmfmt   db      '%s',00dh,00ah,000h     ;format string for printf

_DATA   ends

_BSS    segment
_BSS    ends
;-----------------------------------------------------------------------;
;       code    cs=code segment  ds=flat                                ;
;-----------------------------------------------------------------------;
_TEXT   segment use32 public 'CODE'
        extrn   _printf:NEAR            ;declare external for printf
_main   proc    near
        push    ds                      ;es=ds
        pop     es
        mov     ecx,0                   ;ecx = index to names
main0:  mov     esi,namep[ecx*4]        ;esi = ptr to a name string
        push    ecx                     ;save ecx
        push    esi                     ;display name
        push    offset flat:nmfmt
        call    _printf
        add     esp,8
        pop     ecx                     ;restore ecx
        inc     ecx                     ;loop till done
        cmp     ecx,5
        jb      main0
        xor     eax,eax                 ;exit with 0
        ret
_main   endp
_TEXT   ends
        end

Structures don't allocate memory. If you want to use a structure, you define it and then declare one or more instances of it.

Code:
NAMEPS  STRUCT
        dd      1 dup (?)
        dd      1 dup (?)
        dd      1 dup (?)
        dd      1 dup (?)
        dd      1 dup (?)
NAMEPS  ENDS

namep   NAMEPS  {\
                {offset flat:name1},\
                {offset flat:name2},\
                {offset flat:name3},\
                {offset flat:name4},\
                {offset flat:name5}}

;       requires override of namep to a ptr to dword:

main0:  mov     esi,dword ptr namep[ecx*4] ;esi = ptr to a name string
 
  • #9
I still cannot get the code to work. Here is my TASM conversion:
Code:
.386
.model flat

extrn MessageBoxA:proc

NAMEPS	STRUCT
	dd	1 dup (?)
	dd	1 dup (?)
	dd	1 dup (?)
	dd	1 dup (?)
	dd	1 dup (?)
NAMEPS	ENDS

.DATA
	name1	db	"Micheal", 0
	name2	db	"Stefan", 0
	name3	db	"Judy", 0
	name4	db	"William", 0
	name5	db	"Lora", 0
	names	dd	offset name1
		dd	offset name2
		dd	offset name3
		dd	offset name4
		dd	offset name5

.CODE
MAIN:

	push	ds			;
	pop	es			;es = ds
	mov	ecx, 0			;ecx = counter
nameLoop:
	mov	esi, names[ecx * 4]	;ESI = pointer to name
	push	ecx			;save ecx

	push	0
	push	esi
	push	esi
	push	0
	call	MessageBoxA		;display name

	add	esp, 8
	pop	ecx			;restore ecx
	inc	ecx			;
	cmp	ecx, 5			;loop till done
	jb	nameLoop

	xor	eax, eax		;eax = 0	
	ret

END	MAIN
Do you know what is wrong?
 
  • #10
Thank you for your help.
 
  • #11
maple23 said:
I still cannot get the code to work. Here is my TASM conversion:
Code:
	names	dd	offset name1
Do you know what is wrong?

"offset" should be "offset flat:" or possibly "offset ds:"
 
  • #12
Code:
	push	0
	push	esi
	push	esi
	push	0
	call	MessageBoxA		;display name
	add	esp, 8
Just looked at this again: the "add esp,8" needs to be "add esp,16" to compensate for the 4 pushes done (16 bytes).
 
  • #13
I still get the same error. The program gives a message box with the first name and then closes unexpectedly.
 
  • #14
My fault. WINAPI functions use the stdcall convention, where (as with the Pascal convention) the called function removes its own arguments from the stack. Remove the "add esp,16" after the call to "MessageBoxA". Note that ordinary C library functions such as malloc() and printf() use the "C" calling convention, which requires the caller to restore ESP. On my system, using Visual Studio 2005, I have to use __imp__MessageBoxA@16, which is the MessageBox function with an argument size of 16 bytes; it does a "ret 16" when returning, so the caller doesn't have to do an "add esp,16" afterwards:

Code:
        push    ecx                     ;save ecx
        push    0                       ;display message box
        push    esi
        push    esi
        push    0
        call    __imp__MessageBoxA@16
        pop     ecx                     ;restore ecx
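
To see why the stray "add esp" broke the loop, here is a small C model of the stack pointer arithmetic. It is only a sketch: esp_after_call and esp_imbalance are made-up helper names, the starting ESP value is arbitrary, and nothing here calls a real Windows function.

```c
#include <assert.h>
#include <stdint.h>

enum { SLOT = 4 };  /* bytes per push in 32-bit code */

/* ESP after the caller pushes n_args dwords and the callee returns.
   callee_cleans = 1 models a callee ending in "ret 4*n_args",
   callee_cleans = 0 models a plain "ret" (C convention). */
static uint32_t esp_after_call(uint32_t esp, unsigned n_args, int callee_cleans)
{
    assert(n_args <= 64);       /* sanity limit for this model only  */
    esp -= SLOT * n_args;       /* caller pushes the arguments       */
    esp -= SLOT;                /* call pushes the return address    */
    esp += SLOT;                /* ret pops the return address       */
    if (callee_cleans)
        esp += SLOT * n_args;   /* "ret N" also discards the args    */
    return esp;
}

/* Net change in ESP over the whole call sequence, including any
   "add esp, caller_add" the caller performs afterwards. 0 = balanced. */
static int32_t esp_imbalance(unsigned n_args, int callee_cleans,
                             uint32_t caller_add)
{
    uint32_t start = 0x00100000u;  /* arbitrary illustrative value */
    uint32_t end = esp_after_call(start, n_args, callee_cleans) + caller_add;
    return (int32_t)(end - start);
}
```

In this model esp_imbalance(4, 1, 0) is 0, which matches the corrected MessageBoxA call, while esp_imbalance(4, 1, 8) is 8: ESP ends up 8 bytes too high, so the following "pop ecx" reads the wrong stack slot instead of the saved counter. The printf case, esp_imbalance(2, 0, 8), is balanced because the caller's "add esp,8" does the cleanup there.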
 
  • #15
Thank you! The code works perfectly!
Code:
.386
.model flat

extrn MessageBoxA:proc

.DATA
	name1	db	"Micheal", 0
	name2	db	"Stefan", 0
	name3	db	"Judy", 0
	name4	db	"William", 0
	name5	db	"Lora", 0
	names	dd	offset ds:name1
		dd	offset ds:name2
		dd	offset ds:name3
		dd	offset ds:name4
		dd	offset ds:name5

.CODE
MAIN:

	push	ds			;
	pop	es			;es = ds
	mov	ecx, 0
nameLoop:
	mov	esi, names[ecx * 4]	;ESI = pointer to name
	push	ecx			;save ecx

	push	0
	push	esi
	push	esi
	push	0
	call	MessageBoxA		;display name

	pop	ecx			;restore ecx
	inc	ecx			;
	cmp	ecx, 5			;loop till done
	jb	nameLoop

	xor	eax, eax		;eax = 0	
	ret

END	MAIN
 

1. What is a string array in assembly?

A string array in assembly is a group of strings stored in memory, each a sequence of character bytes ending in a null terminator. The array itself is usually a table of addresses, one per string, each pointing to the first character of its string.

2. How do you declare and initialize a string array in assembly?

In TASM or MASM, define each string with the DB directive, ending it with a 0 byte as the null terminator. Then build the array with DD, storing one 32-bit address per string (for example, names dd offset name1). In 16-bit code with near pointers the entries would be DW instead. Uninitialized space, if needed, is reserved with a ? initializer such as dd ?.
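
The same layout can be written out in C, which may help when checking an assembly listing against it. This is a sketch only: the identifiers name1 to name5 and names are borrowed from the thread above, and names_count is a made-up helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The strings themselves. Each array includes the terminating 0,
   like "name1 db 'Micheal', 0" in the assembly. */
static const char name1[] = "Micheal";
static const char name2[] = "Stefan";
static const char name3[] = "Judy";
static const char name4[] = "William";
static const char name5[] = "Lora";

/* The table of addresses, like "names dd offset name1" and so on. */
static const char *const names[] = { name1, name2, name3, name4, name5 };

/* Number of entries, derived from the table instead of hard-coding 5. */
static size_t names_count(void)
{
    return sizeof names / sizeof names[0];
}
```

Deriving the count from the table size avoids the mismatch that occurs when a name is added to the data but the loop limit is left unchanged.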

3. How do you access and manipulate individual strings in a string array?

To access an individual string, index the pointer table: the first string has index 0, the second index 1, and so on. In 32-bit code each entry is 4 bytes, so the index is scaled by 4, as in mov esi, names[ecx*4]. With the string's address in a register, string instructions such as LODSB and MOVSB can then read or copy it one byte at a time.
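
The index scaling can be made explicit in C by computing the byte offset by hand. The sketch below is illustrative: entry_at and scan_length are made-up names, and the colors table reuses the strings from the color_list example earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sample pointer table for illustration. */
static const char *const colors[] = { "red?", "orange?", "blue?", "green?" };

/* Fetch entry i of a pointer table through an explicit byte offset,
   mirroring "mov esi, names[ecx*4]" where 4 is the pointer size in
   32-bit code. Here the scale is sizeof(pointer) on the host. */
static const char *entry_at(const char *const *table, size_t count, size_t i)
{
    assert(i < count);                     /* reject invalid indices */
    const unsigned char *base = (const unsigned char *)table;
    const char *entry;
    memcpy(&entry, base + i * sizeof entry, sizeof entry);
    return entry;
}

/* Length of a null-terminated string, scanned one byte at a time
   in the manner of a LODSB / TEST AL,AL loop. */
static size_t scan_length(const char *s)
{
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}
```

For example, entry_at(colors, 4, 1) returns the address of "orange?", the same pointer as colors[1]. The explicit offset shows why using the wrong scale factor lands between entries.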

4. What are some common problems that can occur with string arrays in assembly?

Common problems include indexing past the end of the array, forgetting the null terminator on a string, and using the wrong entry size for the pointer table. Stack and register mistakes around calls are just as common: a loop counter in ECX is not preserved across a call unless it is saved and restored, and adjusting ESP after a call to a function that already removes its own arguments leaves the stack unbalanced.

5. How can you debug issues with string arrays in assembly?

To debug issues with string arrays, step through the code in a debugger and inspect the registers and the memory the pointers refer to. A memory dump of the data section shows whether each string is terminated and whether each table entry holds the expected address. Checking ESP before and after each call helps catch an unbalanced stack.
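
When no debugger is at hand, a small dump routine gives a similar view of the bytes. The following C sketch formats a memory range as hex text. The name hex_dump and its buffer-based interface are assumptions made for this example, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format `len` bytes at `mem` as two-digit hex values separated by
   single spaces. Returns the number of characters written (excluding
   the terminating 0), or -1 if `out` is too small. */
static int hex_dump(const void *mem, size_t len, char *out, size_t out_size)
{
    const unsigned char *p = (const unsigned char *)mem;
    size_t used = 0;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    for (size_t i = 0; i < len; i++) {
        int last = (i + 1 == len);
        size_t need = last ? 3 : 4;   /* digits, optional space, 0 byte */
        if (out_size - used < need)
            return -1;
        used += (size_t)snprintf(out + used, out_size - used,
                                 last ? "%02X" : "%02X ", p[i]);
    }
    return (int)used;
}
```

Dumping the 5 bytes of "Judy" (4 characters plus the terminator) into a sufficiently large buffer produces the text "4A 75 64 79 00". A missing final 00 in such a dump is a quick sign of an unterminated string.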
